bsnes/higan/gba/ppu/background.cpp

307 lines
8.3 KiB
C++
Raw Normal View History

Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
uint3 PPU::Background::IO::mode;
uint1 PPU::Background::IO::frame;
uint5 PPU::Background::IO::mosaicWidth;
uint5 PPU::Background::IO::mosaicHeight;
auto PPU::Background::scanline(uint y) -> void {
}
auto PPU::Background::run(uint x, uint y) -> void {
output = {};
if(ppu.blank() || !io.enable) return;
switch(id) {
case PPU::BG0:
if(io.mode <= 1) return linear(x, y);
break;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
case PPU::BG1:
if(io.mode <= 1) return linear(x, y);
break;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
case PPU::BG2:
if(io.mode == 0) return linear(x, y);
if(io.mode <= 2) return affine(x, y);
if(io.mode <= 5) return bitmap(x, y);
break;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
case PPU::BG3:
if(io.mode == 0) return linear(x, y);
if(io.mode == 2) return affine(x, y);
break;
}
}
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
/*
Update to v099r13 release. byuu says: Changelog: - GB core code cleanup completed - GBA core code cleanup completed - some more cleanup on missed processor/arm functions/variables - fixed FC loading icarus bug - "Load ROM File" icarus functionality restored - minor code unification efforts all around (not perfect yet) - MMIO->IO - mmio.cpp->io.cpp - read,write->readIO,writeIO It's been a very long work in progress ... starting all the way back with v094r09, but the major part of the higan code cleanup is now completed! Of course, it's very important to note that this is only for the basic style: - under_score functions and variables are now camelCase - return-type function-name() are now auto function-name() -> return-type - Natural<T>/Integer<T> replace (u)intT_n types where possible - signed/unsigned are now int/uint - most of the x==true,x==false tests changed to x,!x A lot of spot improvements to consistency, simplicity and quality have gone in along the way, of course. But we'll probably never fully finishing beautifying every last line of code in the entire codebase. Still, this is a really great start. Going forward, WIP diffs should start being smaller and of higher quality once again. I know the joke is, "until my coding style changes again", but ... this was way too stressful, way too time consuming, and way too risky. I'm too old and tired now for extreme upheavel like this again. The only major change I'm slowly mulling over would be renaming the using Natural<T>/Integer<T> = (u)intT; shorthand to something that isn't as easily confused with the (u)int_t types ... but we'll see. I'll definitely continue to change small things all the time, but for the larger picture, I need to just accept the style I have and live with it.
2016-06-29 11:10:28 +00:00
auto PPU::renderBackgroundLinear(Registers::Background& bg) -> void {
Update to v087r22 release. byuu says: Changelog: - fixed below pixel green channel on color blending - added semi-transparent objects [Exophase's method] - added full support for windows (both inputs, OBJ windows, and output, with optional color effect disable) - EEPROM uses nall::bitarray now to be friendlier to saving memory to disk - removed incomplete mosaic support for now (too broken, untested) - improved sprite priority. Hopefully it's right now. Just about everything should look great now. It took 25 days, but we finally have the BIOS rendering correctly. In order to do OBJ windows, I had to drop my above/below buffers entirely. I went with the nuclear option. There's separate layers for all BGs and objects. I build the OBJ window table during object rendering. So as a result, after rendering I go back and apply windows (and the object window that now exists.) After that, I have to do a painful Z-buffer select of the top two most important pixels. Since I now know the layers, the blending enable tests are a lot nicer, at least. But this obviously has quite a speed hit: 390fps to 325fps for Mr. Driller 2 title screen. TONC says that "bad" window coordinates do really insane things. GBAtek says it's a simple y2 < y1 || y2 > 160 ? 160 : y2; x2 < x1 || x2 > 240 ? 240 : x2; I like the GBAtek version more, so I went with that. I sure hope it's right ... but my guess is the hardware does this with a counter that wraps around or something. Also, say you have two OBJ mode 2 sprites that overlap each other, but with different priorities. The lower (more important) priority sprite has a clear pixel, but the higher priority sprite has a set pixel. Do we set the "inside OBJ window" flag to true here? Eg does the value OR, or does it hold the most important sprite's pixel value? Cydrak suspects it's OR-based, I concur from what I can see. Mosaic, I am at a loss. I really need a lot more information in order to implement it. For backgrounds, does it apply to the Vcounter of the entire screen? Or does it apply post-scroll? Or does it even apply after every adjust in affine/bitmap modes? I'm betting the hcounter background mosaic starts at the leftmost edge of the screen, and repeats previous pixels to apply the effect. Like SNES, very simple. For sprites, the SNES didn't have this. Does the mosaic grid start at (0,0) of the screen, or at (0,0) of each sprite? The latter will look a lot nicer, but be a lot more complex. Is mosaic on affine objects any different than mosaic of linear(tiled) objects? With that out of the way, we still have to fix the CPU memory access timing, add the rest of the CPU penalty cycles, the memory rotation / alignment / extend behavior needs to be fixed, the shifter desperately needs to be moved from loops to single shift operations, and I need to add flash memory support.
2012-04-13 11:49:32 +00:00
if(regs.control.enable[bg.id] == false) return;
auto& output = layer[bg.id];
if(bg.control.mosaic == false || (regs.vcounter % (1 + regs.mosaic.bgvsize)) == 0) {
bg.vmosaic = regs.vcounter;
}
uint9 voffset = bg.vmosaic + bg.voffset;
uint9 hoffset = bg.hoffset;
uint basemap = bg.control.screenbaseblock << 11;
uint basechr = bg.control.characterbaseblock << 14;
uint px = hoffset & 7, py = voffset & 7;
Tile tile;
uint8 data[8];
for(auto x : range(240)) {
if(x == 0 || px & 8) {
px &= 7;
uint tx = hoffset / 8, ty = voffset / 8;
uint offset = (ty & 31) * 32 + (tx & 31);
if(bg.control.screensize & 1) if(tx & 32) offset += 32 * 32;
if(bg.control.screensize & 2) if(ty & 32) offset += 32 * 32 * (1 + (bg.control.screensize & 1));
offset = basemap + offset * 2;
Update to v099r13 release. byuu says: Changelog: - GB core code cleanup completed - GBA core code cleanup completed - some more cleanup on missed processor/arm functions/variables - fixed FC loading icarus bug - "Load ROM File" icarus functionality restored - minor code unification efforts all around (not perfect yet) - MMIO->IO - mmio.cpp->io.cpp - read,write->readIO,writeIO It's been a very long work in progress ... starting all the way back with v094r09, but the major part of the higan code cleanup is now completed! Of course, it's very important to note that this is only for the basic style: - under_score functions and variables are now camelCase - return-type function-name() are now auto function-name() -> return-type - Natural<T>/Integer<T> replace (u)intT_n types where possible - signed/unsigned are now int/uint - most of the x==true,x==false tests changed to x,!x A lot of spot improvements to consistency, simplicity and quality have gone in along the way, of course. But we'll probably never fully finishing beautifying every last line of code in the entire codebase. Still, this is a really great start. Going forward, WIP diffs should start being smaller and of higher quality once again. I know the joke is, "until my coding style changes again", but ... this was way too stressful, way too time consuming, and way too risky. I'm too old and tired now for extreme upheavel like this again. The only major change I'm slowly mulling over would be renaming the using Natural<T>/Integer<T> = (u)intT; shorthand to something that isn't as easily confused with the (u)int_t types ... but we'll see. I'll definitely continue to change small things all the time, but for the larger picture, I need to just accept the style I have and live with it.
2016-06-29 11:10:28 +00:00
uint16 mapdata = readVRAM(Half, offset);
tile.character = mapdata >> 0;
tile.hflip = mapdata >> 10;
tile.vflip = mapdata >> 11;
tile.palette = mapdata >> 12;
if(bg.control.colormode == 0) {
offset = basechr + tile.character * 32 + (py ^ (tile.vflip ? 7 : 0)) * 4;
Update to v099r13 release. byuu says: Changelog: - GB core code cleanup completed - GBA core code cleanup completed - some more cleanup on missed processor/arm functions/variables - fixed FC loading icarus bug - "Load ROM File" icarus functionality restored - minor code unification efforts all around (not perfect yet) - MMIO->IO - mmio.cpp->io.cpp - read,write->readIO,writeIO It's been a very long work in progress ... starting all the way back with v094r09, but the major part of the higan code cleanup is now completed! Of course, it's very important to note that this is only for the basic style: - under_score functions and variables are now camelCase - return-type function-name() are now auto function-name() -> return-type - Natural<T>/Integer<T> replace (u)intT_n types where possible - signed/unsigned are now int/uint - most of the x==true,x==false tests changed to x,!x A lot of spot improvements to consistency, simplicity and quality have gone in along the way, of course. But we'll probably never fully finishing beautifying every last line of code in the entire codebase. Still, this is a really great start. Going forward, WIP diffs should start being smaller and of higher quality once again. I know the joke is, "until my coding style changes again", but ... this was way too stressful, way too time consuming, and way too risky. I'm too old and tired now for extreme upheavel like this again. The only major change I'm slowly mulling over would be renaming the using Natural<T>/Integer<T> = (u)intT; shorthand to something that isn't as easily confused with the (u)int_t types ... but we'll see. I'll definitely continue to change small things all the time, but for the larger picture, I need to just accept the style I have and live with it.
2016-06-29 11:10:28 +00:00
uint32 word = readVRAM(Word, offset);
for(auto n : range(8)) data[n] = (word >> (n * 4)) & 15;
} else {
offset = basechr + tile.character * 64 + (py ^ (tile.vflip ? 7 : 0)) * 8;
Update to v099r13 release. byuu says: Changelog: - GB core code cleanup completed - GBA core code cleanup completed - some more cleanup on missed processor/arm functions/variables - fixed FC loading icarus bug - "Load ROM File" icarus functionality restored - minor code unification efforts all around (not perfect yet) - MMIO->IO - mmio.cpp->io.cpp - read,write->readIO,writeIO It's been a very long work in progress ... starting all the way back with v094r09, but the major part of the higan code cleanup is now completed! Of course, it's very important to note that this is only for the basic style: - under_score functions and variables are now camelCase - return-type function-name() are now auto function-name() -> return-type - Natural<T>/Integer<T> replace (u)intT_n types where possible - signed/unsigned are now int/uint - most of the x==true,x==false tests changed to x,!x A lot of spot improvements to consistency, simplicity and quality have gone in along the way, of course. But we'll probably never fully finishing beautifying every last line of code in the entire codebase. Still, this is a really great start. Going forward, WIP diffs should start being smaller and of higher quality once again. I know the joke is, "until my coding style changes again", but ... this was way too stressful, way too time consuming, and way too risky. I'm too old and tired now for extreme upheavel like this again. The only major change I'm slowly mulling over would be renaming the using Natural<T>/Integer<T> = (u)intT; shorthand to something that isn't as easily confused with the (u)int_t types ... but we'll see. I'll definitely continue to change small things all the time, but for the larger picture, I need to just accept the style I have and live with it.
2016-06-29 11:10:28 +00:00
uint32 wordlo = readVRAM(Word, offset + 0);
uint32 wordhi = readVRAM(Word, offset + 4);
for(auto n : range(4)) data[0 + n] = (wordlo >> (n * 8)) & 255;
for(auto n : range(4)) data[4 + n] = (wordhi >> (n * 8)) & 255;
}
}
hoffset++;
uint8 color = data[px++ ^ (tile.hflip ? 7 : 0)];
if(color) {
Update to v087r28 release. byuu says: Be sure to run make install, and move required images to their appropriate system profile folders. I still have no warnings in place if those images aren't present. Changelog: - OBJ mosaic should hopefully be emulated correctly now (thanks to krom and Cydrak for testing the hardware behavior) - emulated dummy serial registers, fixes Sonic Advance (you may still need to specify 512KB FlashROM with an appropriate ID, I used Panaonic's) - GBA core exits scheduler (PPU thread) and calls interface->videoRefresh() from main thread (not required, just nice) - SRAM, FRAM, EEPROM and FlashROM initialized to 0xFF if it does not exist (probably not needed, but FlashROM likes to reset to 0xFF anyway) - GBA manifest.xml for file-mode will now use "gamename.xml" instead of "gamename.gba.xml" - started renaming "NES" to "Famicom" and "SNES" to "Super Famicom" in the GUI (may or may not change source code in the long-term) - removed target-libsnes/ - added profile/ Profiles are the major new feature. So far we have: Famicom.sys/{nothing (yet?)} Super Famicom.sys/{ipl.rom} Game Boy.sys/{boot.rom} Game Boy Color.sys/{boot.rom} Game Boy Advance.sys/{bios.rom[not included]} Super Game Boy.sfc/{boot.rom,program.rom[not included]} BS-X Satellaview.sfc/{program.rom,bsx.ram,bsx.pram} Sufami Turbo.sfc/{program.rom} The SGB, BSX and ST cartridges ask you to load GB, BS or ST cartridges directly now. No slot loader for them. So the obvious downsides: you can't quickly pick between different SGB BIOSes, but why would you want to? Just use SGB2/JP. It's still possible, so I'll sacrifice a little complexity for a rare case to make it a lot easier for the more common case. ST cartridges currently won't let you load the secondary slot. BS-X Town cart is the only useful game to load with nothing in the slot, but only barely, since games are all seeded on flash and not on PSRAM images. We can revisit a way to boot the BIOS directly if and when we get the satellite uplink emulated and data can be downloaded onto the PSRAM :P BS-X slotted cartridges still require the secondary slot. My plan for BS-X slotted cartridges is to require a manifest.xml to specify that it has the BS-X slot present. Otherwise, we have to load the ROM into the SNES cartridge class, and parse its header before we can find out if it has one. Screw that. If it's in the XML, I can tell before loading the ROM if I need to present you with an optional slot loading dialog. I will probably do something similar for Sufami Turbo. Not all games even work with a secondary slot, so why ask you to load a second slot for them? Let the XML request a second slot. A complete Sufami Turbo ROM set will be trivial anyway. Not sure how I want to do the sub dialog yet. We want basic file loading, but we don't want it to look like the dialog 'didn't do anything' if it pops back open immediately again. Maybe change the background color of the dialog to a darker gray? Tacky, but it'd give you the visual cue without the need for some subtle text changes.
2012-04-18 13:58:04 +00:00
if(bg.control.colormode == 0) output[x].write(true, bg.control.priority, pram[tile.palette * 16 + color]);
if(bg.control.colormode == 1) output[x].write(true, bg.control.priority, pram[color]);
}
}
}
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
*/
auto PPU::Background::linear(uint x, uint y) -> void {
if(x == 0) {
if(!io.mosaic || (y % (1 + io.mosaicHeight)) == 0) {
vmosaic = y;
}
voffset = vmosaic + io.voffset;
hoffset = io.hoffset;
}
uint px = hoffset & 7;
uint py = voffset & 7;
uint tx = hoffset >> 3;
uint ty = voffset >> 3;
uint offset = (ty & 31) * 32 + (tx & 31);
if(io.screenSize.bit(0) && (tx & 32)) offset += 32 * 32;
if(io.screenSize.bit(1) && (ty & 32)) offset += 32 * 32 * (1 + (io.screenSize.bit(0)));
offset = (io.screenBase << 11) + offset * 2;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
uint16 tilemap = ppu.readVRAM(Half, offset);
uint10 character = tilemap.bits( 0, 9);
uint1 hflip = tilemap.bit (10);
uint1 vflip = tilemap.bit (11);
uint4 palette = tilemap.bits(12,15);
if(io.colorMode == 0) {
offset = (io.characterBase << 14) + character * 32;
offset += (py ^ (vflip ? 7 : 0)) * 4;
offset += (px ^ (hflip ? 7 : 0)) / 2;
if(uint4 color = ppu.readVRAM(Byte, offset) >> (px & 1 ? 4 : 0)) {
output.enable = true;
output.priority = io.priority;
output.color = ppu.pram[palette * 16 + color];
}
} else {
offset = (io.characterBase << 14) + character * 64;
offset += (py ^ (vflip ? 7 : 0)) * 8;
offset += (px ^ (hflip ? 7 : 0)) / 1;
if(uint8 color = ppu.readVRAM(Byte, offset)) {
output.enable = true;
output.priority = io.priority;
output.color = ppu.pram[color];
}
}
hoffset++;
}
/*
Update to v099r13 release. byuu says: Changelog: - GB core code cleanup completed - GBA core code cleanup completed - some more cleanup on missed processor/arm functions/variables - fixed FC loading icarus bug - "Load ROM File" icarus functionality restored - minor code unification efforts all around (not perfect yet) - MMIO->IO - mmio.cpp->io.cpp - read,write->readIO,writeIO It's been a very long work in progress ... starting all the way back with v094r09, but the major part of the higan code cleanup is now completed! Of course, it's very important to note that this is only for the basic style: - under_score functions and variables are now camelCase - return-type function-name() are now auto function-name() -> return-type - Natural<T>/Integer<T> replace (u)intT_n types where possible - signed/unsigned are now int/uint - most of the x==true,x==false tests changed to x,!x A lot of spot improvements to consistency, simplicity and quality have gone in along the way, of course. But we'll probably never fully finishing beautifying every last line of code in the entire codebase. Still, this is a really great start. Going forward, WIP diffs should start being smaller and of higher quality once again. I know the joke is, "until my coding style changes again", but ... this was way too stressful, way too time consuming, and way too risky. I'm too old and tired now for extreme upheavel like this again. The only major change I'm slowly mulling over would be renaming the using Natural<T>/Integer<T> = (u)intT; shorthand to something that isn't as easily confused with the (u)int_t types ... but we'll see. I'll definitely continue to change small things all the time, but for the larger picture, I need to just accept the style I have and live with it.
2016-06-29 11:10:28 +00:00
auto PPU::renderBackgroundAffine(Registers::Background& bg) -> void {
Update to v087r22 release. byuu says: Changelog: - fixed below pixel green channel on color blending - added semi-transparent objects [Exophase's method] - added full support for windows (both inputs, OBJ windows, and output, with optional color effect disable) - EEPROM uses nall::bitarray now to be friendlier to saving memory to disk - removed incomplete mosaic support for now (too broken, untested) - improved sprite priority. Hopefully it's right now. Just about everything should look great now. It took 25 days, but we finally have the BIOS rendering correctly. In order to do OBJ windows, I had to drop my above/below buffers entirely. I went with the nuclear option. There's separate layers for all BGs and objects. I build the OBJ window table during object rendering. So as a result, after rendering I go back and apply windows (and the object window that now exists.) After that, I have to do a painful Z-buffer select of the top two most important pixels. Since I now know the layers, the blending enable tests are a lot nicer, at least. But this obviously has quite a speed hit: 390fps to 325fps for Mr. Driller 2 title screen. TONC says that "bad" window coordinates do really insane things. GBAtek says it's a simple y2 < y1 || y2 > 160 ? 160 : y2; x2 < x1 || x2 > 240 ? 240 : x2; I like the GBAtek version more, so I went with that. I sure hope it's right ... but my guess is the hardware does this with a counter that wraps around or something. Also, say you have two OBJ mode 2 sprites that overlap each other, but with different priorities. The lower (more important) priority sprite has a clear pixel, but the higher priority sprite has a set pixel. Do we set the "inside OBJ window" flag to true here? Eg does the value OR, or does it hold the most important sprite's pixel value? Cydrak suspects it's OR-based, I concur from what I can see. Mosaic, I am at a loss. I really need a lot more information in order to implement it. For backgrounds, does it apply to the Vcounter of the entire screen? Or does it apply post-scroll? Or does it even apply after every adjust in affine/bitmap modes? I'm betting the hcounter background mosaic starts at the leftmost edge of the screen, and repeats previous pixels to apply the effect. Like SNES, very simple. For sprites, the SNES didn't have this. Does the mosaic grid start at (0,0) of the screen, or at (0,0) of each sprite? The latter will look a lot nicer, but be a lot more complex. Is mosaic on affine objects any different than mosaic of linear(tiled) objects? With that out of the way, we still have to fix the CPU memory access timing, add the rest of the CPU penalty cycles, the memory rotation / alignment / extend behavior needs to be fixed, the shifter desperately needs to be moved from loops to single shift operations, and I need to add flash memory support.
2012-04-13 11:49:32 +00:00
if(regs.control.enable[bg.id] == false) return;
auto& output = layer[bg.id];
uint basemap = bg.control.screenbaseblock << 11;
uint basechr = bg.control.characterbaseblock << 14;
uint screensize = 16 << bg.control.screensize;
uint screenwrap = (1 << (bg.control.affinewrap ? 7 + bg.control.screensize : 20)) - 1;
if(bg.control.mosaic == false || (regs.vcounter % (1 + regs.mosaic.bgvsize)) == 0) {
bg.hmosaic = bg.lx;
bg.vmosaic = bg.ly;
}
int28 fx = bg.hmosaic;
int28 fy = bg.vmosaic;
for(auto x : range(240)) {
uint cx = (fx >> 8) & screenwrap, tx = cx / 8, px = cx & 7;
uint cy = (fy >> 8) & screenwrap, ty = cy / 8, py = cy & 7;
if(tx < screensize && ty < screensize) {
uint8 character = vram[basemap + ty * screensize + tx];
uint8 color = vram[basechr + (character * 64) + py * 8 + px];
Update to v087r28 release. byuu says: Be sure to run make install, and move required images to their appropriate system profile folders. I still have no warnings in place if those images aren't present. Changelog: - OBJ mosaic should hopefully be emulated correctly now (thanks to krom and Cydrak for testing the hardware behavior) - emulated dummy serial registers, fixes Sonic Advance (you may still need to specify 512KB FlashROM with an appropriate ID, I used Panaonic's) - GBA core exits scheduler (PPU thread) and calls interface->videoRefresh() from main thread (not required, just nice) - SRAM, FRAM, EEPROM and FlashROM initialized to 0xFF if it does not exist (probably not needed, but FlashROM likes to reset to 0xFF anyway) - GBA manifest.xml for file-mode will now use "gamename.xml" instead of "gamename.gba.xml" - started renaming "NES" to "Famicom" and "SNES" to "Super Famicom" in the GUI (may or may not change source code in the long-term) - removed target-libsnes/ - added profile/ Profiles are the major new feature. So far we have: Famicom.sys/{nothing (yet?)} Super Famicom.sys/{ipl.rom} Game Boy.sys/{boot.rom} Game Boy Color.sys/{boot.rom} Game Boy Advance.sys/{bios.rom[not included]} Super Game Boy.sfc/{boot.rom,program.rom[not included]} BS-X Satellaview.sfc/{program.rom,bsx.ram,bsx.pram} Sufami Turbo.sfc/{program.rom} The SGB, BSX and ST cartridges ask you to load GB, BS or ST cartridges directly now. No slot loader for them. So the obvious downsides: you can't quickly pick between different SGB BIOSes, but why would you want to? Just use SGB2/JP. It's still possible, so I'll sacrifice a little complexity for a rare case to make it a lot easier for the more common case. ST cartridges currently won't let you load the secondary slot. BS-X Town cart is the only useful game to load with nothing in the slot, but only barely, since games are all seeded on flash and not on PSRAM images. We can revisit a way to boot the BIOS directly if and when we get the satellite uplink emulated and data can be downloaded onto the PSRAM :P BS-X slotted cartridges still require the secondary slot. My plan for BS-X slotted cartridges is to require a manifest.xml to specify that it has the BS-X slot present. Otherwise, we have to load the ROM into the SNES cartridge class, and parse its header before we can find out if it has one. Screw that. If it's in the XML, I can tell before loading the ROM if I need to present you with an optional slot loading dialog. I will probably do something similar for Sufami Turbo. Not all games even work with a secondary slot, so why ask you to load a second slot for them? Let the XML request a second slot. A complete Sufami Turbo ROM set will be trivial anyway. Not sure how I want to do the sub dialog yet. We want basic file loading, but we don't want it to look like the dialog 'didn't do anything' if it pops back open immediately again. Maybe change the background color of the dialog to a darker gray? Tacky, but it'd give you the visual cue without the need for some subtle text changes.
2012-04-18 13:58:04 +00:00
if(color) output[x].write(true, bg.control.priority, pram[color]);
}
fx += bg.pa;
fy += bg.pc;
}
bg.lx += bg.pb;
bg.ly += bg.pd;
}
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
*/
auto PPU::Background::affine(uint x, uint y) -> void {
if(x == 0) {
if(!io.mosaic || (y % (1 + io.mosaicHeight)) == 0) {
hmosaic = io.lx;
vmosaic = io.ly;
}
fx = hmosaic;
fy = vmosaic;
}
uint screenSize = 16 << io.screenSize;
uint screenWrap = (1 << (io.affineWrap ? 7 + io.screenSize : 20)) - 1;
uint cx = (fx >> 8) & screenWrap, tx = cx >> 3, px = cx & 7;
uint cy = (fy >> 8) & screenWrap, ty = cy >> 3, py = cy & 7;
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
if(tx < screenSize && ty < screenSize) {
uint8 character = ppu.vram[(io.screenBase << 11) + ty * screenSize + tx];
if(uint8 color = ppu.vram[(io.characterBase << 14) + character * 64 + py * 8 + px]) {
output.enable = true;
output.priority = io.priority;
output.color = ppu.pram[color];
}
}
fx += io.pa;
fy += io.pc;
if(x == 239) {
io.lx += io.pb;
io.ly += io.pd;
}
}
/*
Update to v099r13 release. byuu says: Changelog: - GB core code cleanup completed - GBA core code cleanup completed - some more cleanup on missed processor/arm functions/variables - fixed FC loading icarus bug - "Load ROM File" icarus functionality restored - minor code unification efforts all around (not perfect yet) - MMIO->IO - mmio.cpp->io.cpp - read,write->readIO,writeIO It's been a very long work in progress ... starting all the way back with v094r09, but the major part of the higan code cleanup is now completed! Of course, it's very important to note that this is only for the basic style: - under_score functions and variables are now camelCase - return-type function-name() are now auto function-name() -> return-type - Natural<T>/Integer<T> replace (u)intT_n types where possible - signed/unsigned are now int/uint - most of the x==true,x==false tests changed to x,!x A lot of spot improvements to consistency, simplicity and quality have gone in along the way, of course. But we'll probably never fully finishing beautifying every last line of code in the entire codebase. Still, this is a really great start. Going forward, WIP diffs should start being smaller and of higher quality once again. I know the joke is, "until my coding style changes again", but ... this was way too stressful, way too time consuming, and way too risky. I'm too old and tired now for extreme upheavel like this again. The only major change I'm slowly mulling over would be renaming the using Natural<T>/Integer<T> = (u)intT; shorthand to something that isn't as easily confused with the (u)int_t types ... but we'll see. I'll definitely continue to change small things all the time, but for the larger picture, I need to just accept the style I have and live with it.
2016-06-29 11:10:28 +00:00
auto PPU::renderBackgroundBitmap(Registers::Background& bg) -> void {
Update to v087r22 release. byuu says: Changelog: - fixed below pixel green channel on color blending - added semi-transparent objects [Exophase's method] - added full support for windows (both inputs, OBJ windows, and output, with optional color effect disable) - EEPROM uses nall::bitarray now to be friendlier to saving memory to disk - removed incomplete mosaic support for now (too broken, untested) - improved sprite priority. Hopefully it's right now. Just about everything should look great now. It took 25 days, but we finally have the BIOS rendering correctly. In order to do OBJ windows, I had to drop my above/below buffers entirely. I went with the nuclear option. There's separate layers for all BGs and objects. I build the OBJ window table during object rendering. So as a result, after rendering I go back and apply windows (and the object window that now exists.) After that, I have to do a painful Z-buffer select of the top two most important pixels. Since I now know the layers, the blending enable tests are a lot nicer, at least. But this obviously has quite a speed hit: 390fps to 325fps for Mr. Driller 2 title screen. TONC says that "bad" window coordinates do really insane things. GBAtek says it's a simple y2 < y1 || y2 > 160 ? 160 : y2; x2 < x1 || x2 > 240 ? 240 : x2; I like the GBAtek version more, so I went with that. I sure hope it's right ... but my guess is the hardware does this with a counter that wraps around or something. Also, say you have two OBJ mode 2 sprites that overlap each other, but with different priorities. The lower (more important) priority sprite has a clear pixel, but the higher priority sprite has a set pixel. Do we set the "inside OBJ window" flag to true here? Eg does the value OR, or does it hold the most important sprite's pixel value? Cydrak suspects it's OR-based, I concur from what I can see. Mosaic, I am at a loss. I really need a lot more information in order to implement it. For backgrounds, does it apply to the Vcounter of the entire screen? Or does it apply post-scroll? Or does it even apply after every adjust in affine/bitmap modes? I'm betting the hcounter background mosaic starts at the leftmost edge of the screen, and repeats previous pixels to apply the effect. Like SNES, very simple. For sprites, the SNES didn't have this. Does the mosaic grid start at (0,0) of the screen, or at (0,0) of each sprite? The latter will look a lot nicer, but be a lot more complex. Is mosaic on affine objects any different than mosaic of linear(tiled) objects? With that out of the way, we still have to fix the CPU memory access timing, add the rest of the CPU penalty cycles, the memory rotation / alignment / extend behavior needs to be fixed, the shifter desperately needs to be moved from loops to single shift operations, and I need to add flash memory support.
2012-04-13 11:49:32 +00:00
if(regs.control.enable[bg.id] == false) return;
auto& output = layer[bg.id];
uint1 depth = regs.control.bgmode != 4; //0 = 8-bit (Mode 4), 1 = 15-bit (Mode 3, Mode 5)
uint basemap = regs.control.bgmode == 3 ? 0 : 0xa000 * regs.control.frame;
uint width = regs.control.bgmode == 5 ? 160 : 240;
uint height = regs.control.bgmode == 5 ? 128 : 160;
uint mode = depth ? Half : Byte;
if(bg.control.mosaic == false || (regs.vcounter % (1 + regs.mosaic.bgvsize)) == 0) {
bg.hmosaic = bg.lx;
bg.vmosaic = bg.ly;
}
int28 fx = bg.hmosaic;
int28 fy = bg.vmosaic;
for(auto x : range(240)) {
uint px = fx >> 8;
uint py = fy >> 8;
if(px < width && py < height) {
uint offset = py * width + px;
Update to v099r13 release. byuu says: Changelog: - GB core code cleanup completed - GBA core code cleanup completed - some more cleanup on missed processor/arm functions/variables - fixed FC loading icarus bug - "Load ROM File" icarus functionality restored - minor code unification efforts all around (not perfect yet) - MMIO->IO - mmio.cpp->io.cpp - read,write->readIO,writeIO It's been a very long work in progress ... starting all the way back with v094r09, but the major part of the higan code cleanup is now completed! Of course, it's very important to note that this is only for the basic style: - under_score functions and variables are now camelCase - return-type function-name() are now auto function-name() -> return-type - Natural<T>/Integer<T> replace (u)intT_n types where possible - signed/unsigned are now int/uint - most of the x==true,x==false tests changed to x,!x A lot of spot improvements to consistency, simplicity and quality have gone in along the way, of course. But we'll probably never fully finishing beautifying every last line of code in the entire codebase. Still, this is a really great start. Going forward, WIP diffs should start being smaller and of higher quality once again. I know the joke is, "until my coding style changes again", but ... this was way too stressful, way too time consuming, and way too risky. I'm too old and tired now for extreme upheavel like this again. The only major change I'm slowly mulling over would be renaming the using Natural<T>/Integer<T> = (u)intT; shorthand to something that isn't as easily confused with the (u)int_t types ... but we'll see. I'll definitely continue to change small things all the time, but for the larger picture, I need to just accept the style I have and live with it.
2016-06-29 11:10:28 +00:00
uint color = readVRAM(mode, basemap + (offset << depth));
if(depth || color) { //8bpp color 0 is transparent; 15bpp color is always opaque
if(depth == 0) color = pram[color];
if(depth == 1) color = color & 0x7fff;
Update to v087r28 release. byuu says: Be sure to run make install, and move required images to their appropriate system profile folders. I still have no warnings in place if those images aren't present. Changelog: - OBJ mosaic should hopefully be emulated correctly now (thanks to krom and Cydrak for testing the hardware behavior) - emulated dummy serial registers, fixes Sonic Advance (you may still need to specify 512KB FlashROM with an appropriate ID, I used Panaonic's) - GBA core exits scheduler (PPU thread) and calls interface->videoRefresh() from main thread (not required, just nice) - SRAM, FRAM, EEPROM and FlashROM initialized to 0xFF if it does not exist (probably not needed, but FlashROM likes to reset to 0xFF anyway) - GBA manifest.xml for file-mode will now use "gamename.xml" instead of "gamename.gba.xml" - started renaming "NES" to "Famicom" and "SNES" to "Super Famicom" in the GUI (may or may not change source code in the long-term) - removed target-libsnes/ - added profile/ Profiles are the major new feature. So far we have: Famicom.sys/{nothing (yet?)} Super Famicom.sys/{ipl.rom} Game Boy.sys/{boot.rom} Game Boy Color.sys/{boot.rom} Game Boy Advance.sys/{bios.rom[not included]} Super Game Boy.sfc/{boot.rom,program.rom[not included]} BS-X Satellaview.sfc/{program.rom,bsx.ram,bsx.pram} Sufami Turbo.sfc/{program.rom} The SGB, BSX and ST cartridges ask you to load GB, BS or ST cartridges directly now. No slot loader for them. So the obvious downsides: you can't quickly pick between different SGB BIOSes, but why would you want to? Just use SGB2/JP. It's still possible, so I'll sacrifice a little complexity for a rare case to make it a lot easier for the more common case. ST cartridges currently won't let you load the secondary slot. BS-X Town cart is the only useful game to load with nothing in the slot, but only barely, since games are all seeded on flash and not on PSRAM images. We can revisit a way to boot the BIOS directly if and when we get the satellite uplink emulated and data can be downloaded onto the PSRAM :P BS-X slotted cartridges still require the secondary slot. My plan for BS-X slotted cartridges is to require a manifest.xml to specify that it has the BS-X slot present. Otherwise, we have to load the ROM into the SNES cartridge class, and parse its header before we can find out if it has one. Screw that. If it's in the XML, I can tell before loading the ROM if I need to present you with an optional slot loading dialog. I will probably do something similar for Sufami Turbo. Not all games even work with a secondary slot, so why ask you to load a second slot for them? Let the XML request a second slot. A complete Sufami Turbo ROM set will be trivial anyway. Not sure how I want to do the sub dialog yet. We want basic file loading, but we don't want it to look like the dialog 'didn't do anything' if it pops back open immediately again. Maybe change the background color of the dialog to a darker gray? Tacky, but it'd give you the visual cue without the need for some subtle text changes.
2012-04-18 13:58:04 +00:00
output[x].write(true, bg.control.priority, color);
}
}
fx += bg.pa;
fy += bg.pc;
}
bg.lx += bg.pb;
bg.ly += bg.pd;
}
Update to v102r19 release. byuu says: Note: add `#undef OUT` to the top of higan/gba/ppu/ppu.hpp to compile on Windows (ugh ...) Now to await posts about this in four more threads again ;) Changelog: - GBA: rewrote PPU from a scanline-based renderer to a pixel-based renderer - ruby: fixed video/gdi bugs Note that there's an approximately 21% speed penalty compared to v102r18 for the pixel-based renderer. Also, horizontal mosaic effects are not yet implemented. But they should be prior to v103. This one is a little tricky as it currently works on fully rendered scanlines. I need to roll the mosaic into the background renderers, and then for sprites, well ... see below. The trickiest part by far of this new renderer is the object (sprite) system. Unlike every other system I emulate, the GBA supports affine rendering of its sprites. Or in other words, rotation effects. And it also has a very complex priority system. Right now, I can't see any way that the GBA PPU could render pixels in real-time like this. My belief is that there's a 240-entry buffer that fills up the next scanline's row of pixels. Which means it probably also runs on the last scanline of Vblank so that the first scanline has sprite data. However, I didn't design my object renderer like this just yet. For now, it creates a buffer of all 240 pixels right away at the start of the scanline. I know\!\! That's technically scanline-based. But it's only for fetching object tiledata, and it's only temporary. What needs to happen is I need a way to run something like a "mini libco thread" inside of the main thread, so that the object renderer can run in parallel with the rest of the PPU, yet not be a hideous abomination of a state machine, yet also not be horrendously slow as a full libco thread would be. I'm envisioning some kind of stackless yielding coroutine. But I'll need to think through how to design that, given the absence of coroutines even in C++17.
2017-06-04 03:16:44 +00:00
*/
auto PPU::Background::bitmap(uint x, uint y) -> void {
if(x == 0) {
if(!io.mosaic || (y % (1 + io.mosaicHeight)) == 0) {
hmosaic = io.lx;
vmosaic = io.ly;
}
fx = hmosaic;
fy = vmosaic;
}
uint1 depth = io.mode != 4; //0 = 8-bit (mode 4); 1 = 15-bit (mode 3, mode 5)
uint width = io.mode == 5 ? 160 : 240;
uint height = io.mode == 5 ? 128 : 160;
uint mode = depth ? Half : Byte;
uint baseAddress = io.mode == 3 ? 0 : 0xa000 * io.frame;
uint px = fx >> 8;
uint py = fy >> 8;
if(px < width && py < height) {
uint offset = py * width + px;
uint15 color = ppu.readVRAM(mode, baseAddress + (offset << depth));
if(depth || color) { //8bpp color 0 is transparent; 15bpp color is always opaque
if(depth == 0) color = ppu.pram[color];
output.enable = true;
output.priority = io.priority;
output.color = color;
}
}
fx += io.pa;
fy += io.pc;
if(x == 239) {
io.lx += io.pb;
io.ly += io.pd;
}
}
auto PPU::Background::power(uint id) -> void {
this->id = id;
memory::fill(&io, sizeof(IO));
}