Update to v088r03 release.
byuu says:
static vector<uint8_t> file::read(const string &filename); replaces:
static bool file::read(const string &filename, uint8_t *&data, unsigned
&size); This allows automatic deletion of the underlying data.
Added vectorstream, which is obviously a vector<uint8_t> wrapper for
a data stream. Plan is for all data accesses inside my emulation cores
to take stream objects, especially MSU1. This lets you feed the core
anything: memorystream, filestream, zipstream, gzipstream, httpstream,
etc. There will still be exceptions for link and serial, those need
actual library files on disk. But those aren't official hardware devices
anyway.
So to help with speed a bit, I'm rethinking the video rendering path.
Previous system:
- core outputs system-native samples (SNES = 19-bit LRGB, NES = 9-bit
emphasis+palette, DMG = 2-bit grayscale, etc.)
- interfaceSystem transforms samples to 30-bit via lookup table inside
the emulation core
- interfaceSystem masks off overscan areas, if enabled
- interfaceUI runs filter to produce new target buffer, if enabled
- interfaceUI transforms 30-bit video to native display depth (24-bit or
30-bit), and applies color-adjustments (gamma, etc) at the same time
New system:
- all cores now generate an internal palette, and call
Interface::videoColor(uint32_t source, uint16_t red, uint16_t green,
uint16_t blue) to get native display color post-adjusted (gamma, etc
applied already.)
- all cores output to uint32_t* buffer now (output video.palette[color]
instead of just color)
- interfaceUI runs filter to produce new target buffer, if enabled
- interfaceUI memcpy()'s buffer to the video card
videoColor() is pretty neat. source is the raw pixel (as per the
old-format, 19-bit SNES, 9-bit NES, etc), and you can create a color
from that if you really want to. Or return that value to get a buffer
just like v088 and below. red, green, blue are 16 bits per channel,
because why the hell not, right? Just lop off all the bits you don't
want. If you have more bits on your display than that, fuck you :P
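As a rough illustration (not the actual bsnes code), both sides of that call might look something like this; the 15-bit RGB555 source format, the 24-bit packing, and the names Video::generate_palette / palette / interface are assumptions for the sketch, and gamma adjustment is omitted:

//UI side: pack the 16-bit channels down to the display's native depth (24-bit here)
uint32_t Interface::videoColor(uint32_t source, uint16_t red, uint16_t green, uint16_t blue) {
  //a real implementation would apply gamma/color adjustments before lopping off bits
  return (uint32_t)(red >> 8) << 16 | (uint32_t)(green >> 8) << 8 | (uint32_t)(blue >> 8);
}

//core side: build the palette once, then output video.palette[color] for every pixel
void Video::generate_palette() {
  for(unsigned color = 0; color < (1 << 15); color++) {  //assuming a 15-bit RGB555 source
    uint16_t r = color >> 0 & 31, g = color >> 5 & 31, b = color >> 10 & 31;
    //expand each 5-bit channel to 16 bits by bit replication
    uint16_t red   = r << 11 | r << 6 | r << 1 | r >> 4;
    uint16_t green = g << 11 | g << 6 | g << 1 | g >> 4;
    uint16_t blue  = b << 11 | b << 6 | b << 1 | b >> 4;
    palette[color] = interface->videoColor(color, red, green, blue);
  }
}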
The last step is extremely difficult to avoid. Video cards can and do
have pitches that differ from the width of the texture. Trying to make
the core account for this would be really awful. And even if we did
that, the emulation routine would need to write directly to a video card
RAM buffer. Some APIs require you to lock the video buffer while
writing, so this would leave the video buffer locked for a long time.
Probably not catastrophic, but still awful. And lastly, if the
emulation core tried writing directly to the display texture, software
filters would no longer be possible (unless you -really- jump through
hoops and divert to a memory buffer when a filter is enabled, but ...
fuck.)
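For the record, honoring an arbitrary pitch only costs one memcpy per scanline on the UI side; a generic sketch (not the actual interfaceUI code, and copy_frame is a made-up name):

#include <cstdint>
#include <cstring>  //memcpy

//copy a tightly packed uint32_t frame into a locked texture whose row stride
//(pitch, in bytes) may be wider than width * sizeof(uint32_t)
void copy_frame(const uint32_t *src, unsigned width, unsigned height, uint8_t *dst, unsigned pitch) {
  for(unsigned y = 0; y < height; y++) {
    std::memcpy(dst + y * pitch, src + y * width, width * sizeof(uint32_t));
  }
}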
Anyway, the point of all that work was to eliminate an extra video copy,
and the need for a really painful 30-bit to 24-bit conversion (three
shifts, three masks, three array indexes.) So this basically reverts us,
performance-wise, to where we were pre-30 bit support.
[...]
The downside to this is that we're going to need a filter for each
output depth. Since the array type is uint32_t*, and I don't intend to
support higher or lower depths, we really only need 24+30-bit versions
of each filter. Kinda shitty, but oh well.
2012-04-27 12:12:53 +00:00

void PPU::render_forceblank() {
uint32 *line = output + regs.vcounter * 240;
uint16 *last = blur + regs.vcounter * 240;
for(unsigned x = 0; x < 240; x++) {
//forced blanking drives the output white (0x7fff), still averaged with the previous frame to mimic LCD blur
line[x] = video.palette[(0x7fff + last[x] - ((0x7fff ^ last[x]) & 0x0421)) >> 1];
last[x] = 0x7fff;
}
}
void PPU::render_screen() {
uint32 *line = output + regs.vcounter * 240;
uint16 *last = blur + regs.vcounter * 240;
Update to v087r28 release.
byuu says:
Be sure to run make install, and move required images to their appropriate system profile folders.
I still have no warnings in place if those images aren't present.
Changelog:
- OBJ mosaic should hopefully be emulated correctly now (thanks to krom
and Cydrak for testing the hardware behavior)
- emulated dummy serial registers, fixes Sonic Advance (you may still
need to specify 512KB FlashROM with an appropriate ID, I used
Panasonic's)
- GBA core exits scheduler (PPU thread) and calls
interface->videoRefresh() from main thread (not required, just nice)
- SRAM, FRAM, EEPROM and FlashROM initialized to 0xFF if it does not
exist (probably not needed, but FlashROM likes to reset to 0xFF
anyway)
- GBA manifest.xml for file-mode will now use "gamename.xml" instead of
"gamename.gba.xml"
- started renaming "NES" to "Famicom" and "SNES" to "Super Famicom" in
the GUI (may or may not change source code in the long-term)
- removed target-libsnes/
- added profile/
Profiles are the major new feature. So far we have:
Famicom.sys/{nothing (yet?)}
Super Famicom.sys/{ipl.rom}
Game Boy.sys/{boot.rom}
Game Boy Color.sys/{boot.rom}
Game Boy Advance.sys/{bios.rom[not included]}
Super Game Boy.sfc/{boot.rom,program.rom[not included]}
BS-X Satellaview.sfc/{program.rom,bsx.ram,bsx.pram}
Sufami Turbo.sfc/{program.rom}
The SGB, BSX and ST cartridges ask you to load GB, BS or ST cartridges
directly now. No slot loader for them. So the obvious downsides: you
can't quickly pick between different SGB BIOSes, but why would you want
to? Just use SGB2/JP. It's still possible, so I'll sacrifice a little
complexity for a rare case to make it a lot easier for the more common
case. ST cartridges currently won't let you load the secondary slot.
BS-X Town cart is the only useful game to load with nothing in the slot,
but only barely, since games are all seeded on flash and not on PSRAM
images. We can revisit a way to boot the BIOS directly if and when we
get the satellite uplink emulated and data can be downloaded onto the
PSRAM :P BS-X slotted cartridges still require the secondary slot.
My plan for BS-X slotted cartridges is to require a manifest.xml to
specify that it has the BS-X slot present. Otherwise, we have to load
the ROM into the SNES cartridge class, and parse its header before we
can find out if it has one. Screw that. If it's in the XML, I can tell
before loading the ROM if I need to present you with an optional slot
loading dialog. I will probably do something similar for Sufami Turbo.
Not all games even work with a secondary slot, so why ask you to load
a second slot for them? Let the XML request a second slot. A complete
Sufami Turbo ROM set will be trivial anyway. Not sure how I want to do
the sub dialog yet. We want basic file loading, but we don't want it to
look like the dialog 'didn't do anything' if it pops back open
immediately again. Maybe change the background color of the dialog to
a darker gray? Tacky, but it'd give you the visual cue without the need
for some subtle text changes.
2012-04-18 13:58:04 +00:00
if(regs.bg[0].control.mosaic) render_mosaic_background(BG0);
if(regs.bg[1].control.mosaic) render_mosaic_background(BG1);
if(regs.bg[2].control.mosaic) render_mosaic_background(BG2);
if(regs.bg[3].control.mosaic) render_mosaic_background(BG3);
render_mosaic_object();
for(unsigned x = 0; x < 240; x++) {
Update to v087r22 release.
byuu says:
Changelog:
- fixed below pixel green channel on color blending
- added semi-transparent objects [Exophase's method]
- added full support for windows (both inputs, OBJ windows, and output, with optional color effect disable)
- EEPROM uses nall::bitarray now to be friendlier to saving memory to disk
- removed incomplete mosaic support for now (too broken, untested)
- improved sprite priority. Hopefully it's right now.
Just about everything should look great now. It took 25 days, but we
finally have the BIOS rendering correctly.
In order to do OBJ windows, I had to drop my above/below buffers
entirely. I went with the nuclear option. There's separate layers for
all BGs and objects. I build the OBJ window table during object
rendering. So as a result, after rendering I go back and apply windows
(and the object window that now exists.) After that, I have to do
a painful Z-buffer select of the top two most important pixels. Since
I now know the layers, the blending enable tests are a lot nicer, at
least. But this obviously has quite a speed hit: 390fps to 325fps for
Mr. Driller 2 title screen.
TONC says that "bad" window coordinates do really insane things. GBAtek
says it's a simple y2 < y1 || y2 > 160 ? 160 : y2; x2 < x1 || x2 > 240
? 240 : x2; I like the GBAtek version more, so I went with that. I sure
hope it's right ... but my guess is the hardware does this with
a counter that wraps around or something. Also, say you have two OBJ
mode 2 sprites that overlap each other, but with different priorities.
The lower (more important) priority sprite has a clear pixel, but the
higher priority sprite has a set pixel. Do we set the "inside OBJ
window" flag to true here? Eg does the value OR, or does it hold the
most important sprite's pixel value? Cydrak suspects it's OR-based,
I concur from what I can see.
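If the OR interpretation is right, it is also the cheapest to express; a minimal sketch of the idea (render_object_window_pixel and pixelOpaque are hypothetical names, not code from this source):

//an OBJ-window (mode 2) sprite pixel can only set the flag, never clear it,
//so overlapping window sprites combine by OR regardless of their priorities
void PPU::render_object_window_pixel(unsigned x, bool pixelOpaque) {
  if(pixelOpaque) windowmask[Obj][x] = true;
}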
Mosaic, I am at a loss. I really need a lot more information in order to
implement it. For backgrounds, does it apply to the Vcounter of the
entire screen? Or does it apply post-scroll? Or does it even apply after
every adjust in affine/bitmap modes? I'm betting the hcounter
background mosaic starts at the leftmost edge of the screen, and repeats
previous pixels to apply the effect. Like SNES, very simple. For
sprites, the SNES didn't have this. Does the mosaic grid start at (0,0)
of the screen, or at (0,0) of each sprite? The latter will look a lot
nicer, but be a lot more complex. Is mosaic on affine objects any
different than mosaic of linear(tiled) objects?
With that out of the way, we still have to fix the CPU memory access
timing, add the rest of the CPU penalty cycles, the memory rotation
/ alignment / extend behavior needs to be fixed, the shifter desperately
needs to be moved from loops to single shift operations, and I need to
add flash memory support.
2012-04-13 11:49:32 +00:00
Registers::WindowFlags flags;
flags = ~0; //enable all layers if no windows are enabled
//determine active window
if(regs.control.enablewindow[In0] || regs.control.enablewindow[In1] || regs.control.enablewindow[Obj]) {
flags = (uint8)regs.windowflags[Out];
if(regs.control.enablewindow[Obj] && windowmask[Obj][x]) flags = (uint8)regs.windowflags[Obj];
if(regs.control.enablewindow[In1] && windowmask[In1][x]) flags = (uint8)regs.windowflags[In1];
if(regs.control.enablewindow[In0] && windowmask[In0][x]) flags = (uint8)regs.windowflags[In0];
}
//priority sorting: find topmost two pixels
unsigned a = 5, b = 5;
for(signed p = 3; p >= 0; p--) {
for(signed l = 5; l >= 0; l--) {
if(layer[l][x].enable && layer[l][x].priority == p && flags.enable[l]) {
b = a;
a = l;
}
}
}
auto &above = layer[a];
auto &below = layer[b];
bool blendabove = regs.blend.control.above[a];
bool blendbelow = regs.blend.control.below[b];
unsigned color = above[x].color;
//perform blending, if needed
if(flags.enable[SFX] == false) {
} else if(above[x].translucent && blendbelow) {
color = blend(above[x].color, regs.blend.eva, below[x].color, regs.blend.evb);
} else if(regs.blend.control.mode == 1 && blendabove && blendbelow) {
color = blend(above[x].color, regs.blend.eva, below[x].color, regs.blend.evb);
} else if(regs.blend.control.mode == 2 && blendabove) {
color = blend(above[x].color, 16 - regs.blend.evy, 0x7fff, regs.blend.evy);
} else if(regs.blend.control.mode == 3 && blendabove) {
color = blend(above[x].color, 16 - regs.blend.evy, 0x0000, regs.blend.evy);
}
//output pixel; blend with previous pixel to simulate GBA LCD blur
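//the expression (a + b - ((a ^ b) & 0x0421)) >> 1 averages two RGB555 colors per channel:
//0x0421 is the low bit of each 5-bit channel, so subtracting it makes every channel sum even
//and the halved result stays within its own five bits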
line[x] = video.palette[(color + last[x] - ((color ^ last[x]) & 0x0421)) >> 1];
last[x] = color;
}
}
void PPU::render_window(unsigned w) {
unsigned y = regs.vcounter;
unsigned y1 = regs.window[w].y1, y2 = regs.window[w].y2;
unsigned x1 = regs.window[w].x1, x2 = regs.window[w].x2;
if(y2 < y1 || y2 > 160) y2 = 160;
if(x2 < x1 || x2 > 240) x2 = 240;
if(y >= y1 && y < y2) {
for(unsigned x = x1; x < x2; x++) {
windowmask[w][x] = true;
}
}
}
//eva and evb are blend coefficients in sixteenths; each 5-bit channel is scaled, then clamped to 31
unsigned PPU::blend(unsigned above, unsigned eva, unsigned below, unsigned evb) {
uint5 ar = above >> 0, ag = above >> 5, ab = above >> 10;
uint5 br = below >> 0, bg = below >> 5, bb = below >> 10;
unsigned r = (ar * eva + br * evb) >> 4;
unsigned g = (ag * eva + bg * evb) >> 4;
unsigned b = (ab * eva + bb * evb) >> 4;
return min(31, r) << 0 | min(31, g) << 5 | min(31, b) << 10;
}