Update to v068r10 release.
(there was no r09 release posted to the WIP thread)
byuu says:
It is feature-complete, but horizontal mosaic is less accurate. I have
an idea for a mosaic color ring buffer to get it equally accurate, but
I haven't implemented it yet. For now it's just a simple x & ~(mosaic >>
1) trick that is passable.
Hires blending was left out, as it's more processor intensive and
blargg's NTSC does a better job with that anyway.
There's some OPT vertical positioning issues in the SNES Test Program's
character test; Goodbye, Anthrox has some sort of fast CPU DMA issue;
etc.
Total speedup is a mere 13.5%. Not quite the 50% I wanted in the best
case, but I'll take what I can get.
254->289fps in Zelda 3 on my E8400 now. There's another 15% hiding with
blargg's SMP and 5-10% with blargg's fast DSP, but they lose too much
accuracy. It'd put me at or below Snes9X accuracy, while still being 50%
slower.
SSE2 was performing worse this time, both on x86 and amd64, so I left
that optimization off.
So, barring a miracle, this is about the best it's going to get.
2010-09-03 11:37:36 +00:00
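The constructor later in this file precomputes `mosaic_table[m][x] = (x / (m + 1)) * (m + 1)`, which snaps a coordinate to the first pixel of its mosaic block. The following is a minimal sketch of that table only, not of the `x & ~(mosaic >> 1)` approximation mentioned in the commit message. The names `snap_to_mosaic_block` and `build_mosaic_table` are hypothetical and do not appear in the source, and `std::vector` is swapped in for the raw `new[]`/`delete[]` pairs purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the source): snap coordinate x to the
// first pixel of its mosaic block. `mosaic` is the 4-bit register value
// (0..15), so the block size is mosaic + 1.
inline uint16_t snap_to_mosaic_block(uint16_t x, unsigned mosaic) {
  assert(mosaic < 16);
  const unsigned size = mosaic + 1;
  return static_cast<uint16_t>((x / size) * size);
}

// Hypothetical helper: precompute the mapping for all 16 mosaic sizes.
// std::vector owns the storage, so no matching destructor is required.
inline std::vector<std::vector<uint16_t>> build_mosaic_table(unsigned width = 4096) {
  std::vector<std::vector<uint16_t>> table(16, std::vector<uint16_t>(width));
  for(unsigned m = 0; m < 16; m++) {
    for(unsigned x = 0; x < width; x++) {
      table[m][x] = snap_to_mosaic_block(static_cast<uint16_t>(x), m);
    }
  }
  return table;
}
```

With a register value of 3 (block size 4), coordinates 8 through 11 all map to 8. A register value of 0 leaves every coordinate unchanged.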
|
|
|
#include "mode7.cpp"
|
2010-09-01 13:20:05 +00:00
|
|
|
|
2015-12-06 21:11:41 +00:00
|
|
|
PPU::Background::Background(PPU& self, uint id) : self(self), id(id) {
|
|
|
|
priority0_enable = true;
|
|
|
|
priority1_enable = true;
|
|
|
|
|
|
|
|
opt_valid_bit = (id == ID::BG1 ? 0x2000 : id == ID::BG2 ? 0x4000 : 0x0000);
|
|
|
|
|
|
|
|
mosaic_table = new uint16*[16];
|
|
|
|
for(uint m = 0; m < 16; m++) {
|
|
|
|
mosaic_table[m] = new uint16[4096];
|
|
|
|
for(uint x = 0; x < 4096; x++) {
|
|
|
|
mosaic_table[m][x] = (x / (m + 1)) * (m + 1);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
PPU::Background::~Background() {
|
|
|
|
for(uint m = 0; m < 16; m++) delete[] mosaic_table[m];
|
|
|
|
delete[] mosaic_table;
|
|
|
|
}
|
|
|
|
|
|
|
|
auto PPU::Background::get_tile(uint hoffset, uint voffset) -> uint {
|
|
|
|
uint tile_x = (hoffset & mask_x) >> tile_width;
|
|
|
|
uint tile_y = (voffset & mask_y) >> tile_height;
|
2010-09-01 13:20:05 +00:00
|
|
|
|
2015-12-06 21:11:41 +00:00
|
|
|
uint tile_pos = ((tile_y & 0x1f) << 5) + (tile_x & 0x1f);
if(tile_y & 0x20) tile_pos += scy;
|
|
|
|
if(tile_x & 0x20) tile_pos += scx;
|
|
|
|
|
Update to v068r13 release.
byuu says:
Bug-fix night for the new PPUs.
Accuracy:
Fixed BG palette clamping, which fixes Taz-Mania.
Added blocking for CGRAM writes during active display, to match the
compatibility core. It really should override to the last fetched
palette color, I'll probably try that out soon enough.
Performance:
Mosaic should match the other renderers. Unfortunately, as suspected, it
murders speed. 290->275fps. It's now only 11fps faster, hardly worth it
at all. But the old rendering code is really awful, so maybe it's for
the best it gets refreshed.
It's really tough to understand why this is such a performance hit, it's
just a decrement+compare check four times per pixel. But yeah, it hits
it really, really hard.
Fixed a missing check in Mode4 offset-per-tile, fixes vertical alignment
of a test image in the SNES Test Program.
2010-09-05 13:22:26 +00:00
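The "decrement+compare check" described in this commit message corresponds to the `--mosaic_hcounter == 0` test inside `render()`. Below is a rough sketch of that idea on a single row of already-decoded pixel values. The function name `apply_horizontal_mosaic` and the flat `std::vector<uint8_t>` row are assumptions made for illustration; the real renderer works per tile and also latches priority and color.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical standalone function (not part of the source): hold the first
// pixel of each mosaic block for the remaining pixels of that block.
// `mosaic` is the 4-bit register value (0..15); block size is mosaic + 1.
inline std::vector<uint8_t> apply_horizontal_mosaic(const std::vector<uint8_t>& row, unsigned mosaic) {
  assert(mosaic < 16);
  std::vector<uint8_t> out(row.size());
  unsigned counter = 1;  // starts at 1 so the very first pixel is sampled
  uint8_t held = 0;      // most recently sampled pixel value
  for(std::size_t x = 0; x < row.size(); x++) {
    if(--counter == 0) {     // one decrement and one compare per pixel
      counter = mosaic + 1;  // reload for the next block
      held = row[x];         // sample a fresh pixel at the block start
    }
    out[x] = held;
  }
  return out;
}
```

For example, a row of `{1, 2, 3, 4, 5, 6, 7}` with a register value of 2 becomes `{1, 1, 1, 4, 4, 4, 7}`. A register value of 0 reloads the counter to 1 every pixel, so the row passes through unchanged.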
|
|
|
const uint16 tiledata_addr = regs.screen_addr + (tile_pos << 1);
|
Update to v074r11 release.
byuu says:
Changelog:
- debugger compiles on all three profiles
- libsnes compiles on all three platforms (no API changes to libsnes)
- memory.cpp : namespace memory removed (wram -> cpu, apuram -> smp,
vram, oam, cgram -> ppu)
- sa1.cpp : namespace memory removed (SA-1 specific functions merged
inline to SA1::bus_read,write)
- GameBoy: added serial link support with interrupts and proper 8192hz
timing, but obviously it acts as if no other GB is connected to it
- GameBoy: added STAT OAM interrupt, and better STAT d1,d0 mode values
- UI: since Qt is dead, I've renamed the config files back to bsnes.cfg
and bsnes-geometry.cfg
- SA1: IRAM was not syncing to CPU on SA-1 side
- PPU/Accuracy and PPU/Performance needed Sprite oam renamed to Sprite
sprite; so that I could add uint8 oam[544]
- makes more sense anyway, OAM = object attribute memory, obj or
sprite are better names for Sprite rendering class
- more cleanup
2011-01-24 09:03:17 +00:00
|
|
|
return (ppu.vram[tiledata_addr + 0] << 0) + (ppu.vram[tiledata_addr + 1] << 8);
}
|
2010-09-01 13:20:05 +00:00
|
|
|
|
2015-12-06 21:11:41 +00:00
|
|
|
auto PPU::Background::offset_per_tile(uint x, uint y, uint& hoffset, uint& voffset) -> void {
|
|
|
|
uint opt_x = (x + (hscroll & 7)), hval, vval;
if(opt_x >= 8) {
|
|
|
|
hval = self.bg3.get_tile((opt_x - 8) + (self.bg3.regs.hoffset & ~7), self.bg3.regs.voffset + 0);
|
|
|
|
if(self.regs.bgmode != 4)
|
|
|
|
vval = self.bg3.get_tile((opt_x - 8) + (self.bg3.regs.hoffset & ~7), self.bg3.regs.voffset + 8);
|
|
|
|
|
|
|
|
if(self.regs.bgmode == 4) {
|
|
|
|
if(hval & opt_valid_bit) {
if(!(hval & 0x8000)) {
|
|
|
|
hoffset = opt_x + (hval & ~7);
|
|
|
|
} else {
|
|
|
|
voffset = y + hval;
|
|
|
|
}
}
|
|
|
|
} else {
|
|
|
|
if(hval & opt_valid_bit) {
|
|
|
|
hoffset = opt_x + (hval & ~7);
|
|
|
|
}
|
|
|
|
if(vval & opt_valid_bit) {
|
|
|
|
voffset = y + vval;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-12-06 21:11:41 +00:00
|
|
|
auto PPU::Background::scanline() -> void {
if(self.vcounter() == 1) {
|
|
|
|
mosaic_vcounter = regs.mosaic + 1;
|
2010-10-11 10:36:00 +00:00
|
|
|
mosaic_voffset = 1;
} else if(--mosaic_vcounter == 0) {
|
|
|
|
mosaic_vcounter = regs.mosaic + 1;
|
2010-10-11 10:36:00 +00:00
|
|
|
mosaic_voffset += regs.mosaic + 1;
}
|
|
|
|
if(self.regs.display_disable) return;
hires = (self.regs.bgmode == 5 || self.regs.bgmode == 6);
|
|
|
|
width = !hires ? 256 : 512;
|
2010-09-01 13:20:05 +00:00
tile_height = regs.tile_size ? 4 : 3;
|
|
|
|
tile_width = hires ? 4 : tile_height;
|
2010-09-01 13:20:05 +00:00
mask_x = (tile_height == 4 ? width << 1 : width);
|
|
|
|
mask_y = mask_x;
|
2010-09-01 13:20:05 +00:00
|
|
|
if(regs.screen_size & 1) mask_x <<= 1;
|
|
|
|
if(regs.screen_size & 2) mask_y <<= 1;
|
|
|
|
mask_x--;
|
|
|
|
mask_y--;
scx = (regs.screen_size & 1 ? 32 << 5 : 0);
|
|
|
|
scy = (regs.screen_size & 2 ? 32 << 5 : 0);
|
2010-09-01 13:20:05 +00:00
|
|
|
if(regs.screen_size == 3) scy <<= 1;
}
|
2010-09-01 13:20:05 +00:00
|
|
|
|
2015-12-06 21:11:41 +00:00
|
|
|
auto PPU::Background::render() -> void {
if(regs.mode == Mode::Inactive) return;
|
|
|
|
if(regs.main_enable == false && regs.sub_enable == false) return;
|
|
|
|
|
|
|
|
if(regs.main_enable) window.render(0);
|
|
|
|
if(regs.sub_enable) window.render(1);
|
|
|
|
if(regs.mode == Mode::Mode7) return render_mode7();
|
|
|
|
|
2015-12-06 21:11:41 +00:00
|
|
|
uint priority0 = (priority0_enable ? regs.priority0 : 0);
|
|
|
|
uint priority1 = (priority1_enable ? regs.priority1 : 0);
|
2010-09-06 11:13:51 +00:00
|
|
|
if(priority0 + priority1 == 0) return;
|
|
|
|
|
2015-12-06 21:11:41 +00:00
|
|
|
uint mosaic_hcounter = 1;
|
|
|
|
uint mosaic_palette = 0;
|
|
|
|
uint mosaic_priority = 0;
|
|
|
|
uint mosaic_color = 0;
2015-12-06 21:11:41 +00:00
|
|
|
const uint bgpal_index = (self.regs.bgmode == 0 ? id << 5 : 0);
|
|
|
|
const uint pal_size = 2 << regs.mode;
|
|
|
|
const uint tile_mask = 0x0fff >> regs.mode;
|
|
|
|
const uint tiledata_index = regs.tiledata_addr >> (4 + regs.mode);
hscroll = regs.hoffset;
|
|
|
|
vscroll = regs.voffset;
|
2010-09-01 13:20:05 +00:00
|
|
|
|
2015-12-06 21:11:41 +00:00
|
|
|
uint y = (regs.mosaic == 0 ? self.vcounter() : mosaic_voffset);
|
2010-09-01 13:20:05 +00:00
|
|
|
if(hires) {
|
|
|
|
hscroll <<= 1;
|
|
|
|
if(self.regs.interlace) y = (y << 1) + self.field();
|
|
|
|
}
|
|
|
|
|
2015-12-06 21:11:41 +00:00
|
|
|
uint tile_pri, tile_num;
|
|
|
|
uint pal_index, pal_num;
|
|
|
|
uint hoffset, voffset, col;
|
2010-09-01 13:20:05 +00:00
|
|
|
bool mirror_x, mirror_y;
|
|
|
|
|
|
|
|
const bool is_opt_mode = (self.regs.bgmode == 2 || self.regs.bgmode == 4 || self.regs.bgmode == 6);
|
|
|
|
const bool is_direct_color_mode = (self.screen.regs.direct_color == true && id == ID::BG1 && (self.regs.bgmode == 3 || self.regs.bgmode == 4));
|
|
|
|
|
2015-12-06 21:11:41 +00:00
|
|
|
int x = 0 - (hscroll & 7);
|
2010-09-01 13:20:05 +00:00
|
|
|
while(x < width) {
|
|
|
|
hoffset = x + hscroll;
|
|
|
|
voffset = y + vscroll;
if(is_opt_mode) offset_per_tile(x, y, hoffset, voffset);
|
2010-09-01 13:20:05 +00:00
|
|
|
hoffset &= mask_x;
|
|
|
|
voffset &= mask_y;
tile_num = get_tile(hoffset, voffset);
|
2010-09-01 13:20:05 +00:00
|
|
|
mirror_y = tile_num & 0x8000;
|
|
|
|
mirror_x = tile_num & 0x4000;
|
2010-09-06 11:13:51 +00:00
|
|
|
tile_pri = tile_num & 0x2000 ? priority1 : priority0;
|
2010-09-01 13:20:05 +00:00
|
|
|
pal_num = (tile_num >> 10) & 7;
pal_index = (bgpal_index + (pal_num << pal_size)) & 0xff;
|
2010-09-01 13:20:05 +00:00
if(tile_width == 4 && (bool)(hoffset & 8) != mirror_x) tile_num += 1;
|
|
|
|
if(tile_height == 4 && (bool)(voffset & 8) != mirror_y) tile_num += 16;
|
|
|
|
tile_num = ((tile_num & 0x03ff) + tiledata_index) & tile_mask;
|
2010-09-01 13:20:05 +00:00
|
|
|
|
|
|
|
if(mirror_y) voffset ^= 7;
|
2015-12-06 21:11:41 +00:00
|
|
|
uint mirror_xmask = !mirror_x ? 0 : 7;
|
2010-09-01 13:20:05 +00:00
|
|
|
|
2013-05-05 09:21:30 +00:00
|
|
|
uint8* tiledata = self.cache.tile(regs.mode, tile_num);
|
2010-09-01 13:20:05 +00:00
|
|
|
tiledata += ((voffset & 7) * 8);
|
|
|
|
|
2015-12-06 21:11:41 +00:00
|
|
|
for(uint n = 0; n < 8; n++, x++) {
if(x & width) continue;
|
|
|
|
if(--mosaic_hcounter == 0) {
|
|
|
|
mosaic_hcounter = regs.mosaic + 1;
|
|
|
|
mosaic_palette = tiledata[n ^ mirror_xmask];
|
|
|
|
mosaic_priority = tile_pri;
if(is_direct_color_mode) {
|
|
|
|
mosaic_color = self.screen.get_direct_color(pal_num, mosaic_palette);
|
|
|
|
} else {
|
|
|
|
mosaic_color = self.screen.get_palette(pal_index + mosaic_palette);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if(mosaic_palette == 0) continue;
|
|
|
|
|
|
|
|
if(hires == false) {
|
|
|
|
if(regs.main_enable && !window.main[x]) self.screen.output.plot_main(x, mosaic_color, mosaic_priority, id);
|
|
|
|
if(regs.sub_enable && !window.sub[x]) self.screen.output.plot_sub(x, mosaic_color, mosaic_priority, id);
|
|
|
|
} else {
|
2015-12-06 21:11:41 +00:00
|
|
|
int half_x = x >> 1;
|
|
|
|
if(x & 1) {
|
|
|
|
if(regs.main_enable && !window.main[half_x]) self.screen.output.plot_main(half_x, mosaic_color, mosaic_priority, id);
|
|
|
|
} else {
|
|
|
|
if(regs.sub_enable && !window.sub[half_x]) self.screen.output.plot_sub(half_x, mosaic_color, mosaic_priority, id);
|
|
|
|
}
|
2010-09-01 13:20:05 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|