2010-08-09 13:28:56 +00:00
|
|
|
#include "mode7.cpp"
|
2017-09-05 00:56:52 +00:00
|
|
|
uint4 PPU::Background::Mosaic::size;
|
|
|
|
|
|
|
|
auto PPU::Background::hires() const -> bool {
|
|
|
|
return ppu.io.bgMode == 5 || ppu.io.bgMode == 6;
|
|
|
|
}
|
2010-08-09 13:28:56 +00:00
|
|
|
|
2016-06-14 10:51:54 +00:00
|
|
|
auto PPU::Background::voffset() const -> uint16 {
|
2017-09-05 00:56:52 +00:00
|
|
|
return mosaic.enable ? latch.voffset : io.voffset;
|
Update to v095r05 release.
byuu says:
Changelog:
- GBA: lots of emulation improvements
- PPU PRAM is 16-bits wide
- DMA masks &~1/Half, &~3/Word
- VRAM OBJ 8-bit writes are ignored
- OAM 8-bit writes are ignored
- BGnCNT unused bits are writable*
- BG(0,1)CNT can't set the d13
- BLDALPHA is readable (fixes Donkey Kong Country, etc)
- SNES: lots of code cleanups
- sfc/chip => sfc/coprocessor
- UI: save most recent controller selection
GBA test scores: 1552/1552, 37/38, 1020/1260
(* forgot to add the value to the read function, so endrift's I/O tests
for them will fail. Fixed locally.)
Note: SNES is the only system with multiple controller/expansion port
options, and as such is the only one with a "None" option. Because it's
shared by the controller and expansion port, it ends up sorted first in
the list. This means that on your first run, you'll need to go to Super
Famicom->Controller Port 1 and select "Gamepad", otherwise input won't
work.
Also note that changing the expansion port device requires loading a new
cart. Unlike controllers, you aren't meant to hotplug expansion port
devices.
2015-11-12 10:15:03 +00:00
|
|
|
}
|
|
|
|
|
2016-06-14 10:51:54 +00:00
|
|
|
auto PPU::Background::hoffset() const -> uint16 {
|
2017-09-05 00:56:52 +00:00
|
|
|
return mosaic.enable ? latch.hoffset : io.hoffset;
|
2012-05-09 23:35:29 +00:00
|
|
|
}
|
|
|
|
|
2011-06-05 03:45:04 +00:00
|
|
|
//V = 0, H = 0
|
Update to v095r05 release.
byuu says:
Changelog:
- GBA: lots of emulation improvements
- PPU PRAM is 16-bits wide
- DMA masks &~1/Half, &~3/Word
- VRAM OBJ 8-bit writes are ignored
- OAM 8-bit writes are ignored
- BGnCNT unused bits are writable*
- BG(0,1)CNT can't set the d13
- BLDALPHA is readable (fixes Donkey Kong Country, etc)
- SNES: lots of code cleanups
- sfc/chip => sfc/coprocessor
- UI: save most recent controller selection
GBA test scores: 1552/1552, 37/38, 1020/1260
(* forgot to add the value to the read function, so endrift's I/O tests
for them will fail. Fixed locally.)
Note: SNES is the only system with multiple controller/expansion port
options, and as such is the only one with a "None" option. Because it's
shared by the controller and expansion port, it ends up sorted first in
the list. This means that on your first run, you'll need to go to Super
Famicom->Controller Port 1 and select "Gamepad", otherwise input won't
work.
Also note that changing the expansion port device requires loading a new
cart. Unlike controllers, you aren't meant to hotplug expansion port
devices.
2015-11-12 10:15:03 +00:00
|
|
|
auto PPU::Background::frame() -> void {
|
2010-08-28 09:47:06 +00:00
|
|
|
}
|
|
|
|
|
2011-06-05 03:45:04 +00:00
|
|
|
//H = 0
|
Update to v095r05 release.
byuu says:
Changelog:
- GBA: lots of emulation improvements
- PPU PRAM is 16-bits wide
- DMA masks &~1/Half, &~3/Word
- VRAM OBJ 8-bit writes are ignored
- OAM 8-bit writes are ignored
- BGnCNT unused bits are writable*
- BG(0,1)CNT can't set the d13
- BLDALPHA is readable (fixes Donkey Kong Country, etc)
- SNES: lots of code cleanups
- sfc/chip => sfc/coprocessor
- UI: save most recent controller selection
GBA test scores: 1552/1552, 37/38, 1020/1260
(* forgot to add the value to the read function, so endrift's I/O tests
for them will fail. Fixed locally.)
Note: SNES is the only system with multiple controller/expansion port
options, and as such is the only one with a "None" option. Because it's
shared by the controller and expansion port, it ends up sorted first in
the list. This means that on your first run, you'll need to go to Super
Famicom->Controller Port 1 and select "Gamepad", otherwise input won't
work.
Also note that changing the expansion port device requires loading a new
cart. Unlike controllers, you aren't meant to hotplug expansion port
devices.
2015-11-12 10:15:03 +00:00
|
|
|
auto PPU::Background::scanline() -> void {
|
2011-06-05 03:45:04 +00:00
|
|
|
}
|
|
|
|
|
2012-05-09 23:35:29 +00:00
|
|
|
//H = 28
|
Update to v095r05 release.
byuu says:
Changelog:
- GBA: lots of emulation improvements
- PPU PRAM is 16-bits wide
- DMA masks &~1/Half, &~3/Word
- VRAM OBJ 8-bit writes are ignored
- OAM 8-bit writes are ignored
- BGnCNT unused bits are writable*
- BG(0,1)CNT can't set the d13
- BLDALPHA is readable (fixes Donkey Kong Country, etc)
- SNES: lots of code cleanups
- sfc/chip => sfc/coprocessor
- UI: save most recent controller selection
GBA test scores: 1552/1552, 37/38, 1020/1260
(* forgot to add the value to the read function, so endrift's I/O tests
for them will fail. Fixed locally.)
Note: SNES is the only system with multiple controller/expansion port
options, and as such is the only one with a "None" option. Because it's
shared by the controller and expansion port, it ends up sorted first in
the list. This means that on your first run, you'll need to go to Super
Famicom->Controller Port 1 and select "Gamepad", otherwise input won't
work.
Also note that changing the expansion port device requires loading a new
cart. Unlike controllers, you aren't meant to hotplug expansion port
devices.
2015-11-12 10:15:03 +00:00
|
|
|
auto PPU::Background::begin() -> void {
|
Update to v068r14 release.
byuu says:
Holy hell, that was a total brain twister. After hours of crazy bit
twiddling and debug printf's, I finally figured out how to allow both
lores and hires scrolling in the accurate PPU renderer. In the process,
I modified the main loop to run from -7 to 255, regardless of the hires
setting, and perform X adjustment inside the tile fetching. This fixed
a strange main/subscreen misalignment issue, so I was able to restore
the proper sub-then-main rendering for the final screen output stage.
Code looks a good bit cleaner this way overall.
I also added load state and save state menus to the tools menu, so you
can use the menubar to load and save to ten slots. I am thinking that
I should nuke the icons. As pretty as they are, it's getting tiresome
trying to find icons for everything, I have no pictures to represent
loading or saving a slot, nor to represent individual slots. I'll just
stick to radios and checkboxes.
2010-09-05 23:08:03 +00:00
|
|
|
x = -7;
|
2016-06-14 10:51:54 +00:00
|
|
|
y = ppu.vcounter();
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
|
2017-09-05 00:56:52 +00:00
|
|
|
tileCounter = 7 - (io.hoffset & 7) << hires();
|
|
|
|
for(auto& word : data) word = 0;
|
|
|
|
|
2011-06-05 03:25:24 +00:00
|
|
|
if(y == 1) {
|
2017-09-05 00:56:52 +00:00
|
|
|
mosaic.vcounter = mosaic.size + 1;
|
2011-06-05 03:25:24 +00:00
|
|
|
mosaic.voffset = 1;
|
Update to v099r14 release.
byuu says:
Changelog:
- (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel
like they were contributing enough to be worth it]
- cleaned up nall::integer,natural,real functionality
- toInteger, toNatural, toReal for parsing strings to numbers
- fromInteger, fromNatural, fromReal for creating strings from numbers
- (string,Markup::Node,SQL-based-classes)::(integer,natural,real)
left unchanged
- template<typename T> numeral(T value, long padding, char padchar)
-> string for print() formatting
- deduces integer,natural,real based on T ... cast the value if you
want to override
- there still exists binary,octal,hex,pointer for explicit print()
formatting
- lstring -> string_vector [but using lstring = string_vector; is
declared]
- would be nice to remove the using lstring eventually ... but that'd
probably require 10,000 lines of changes >_>
- format -> string_format [no using here; format was too ambiguous]
- using integer = Integer<sizeof(int)*8>; and using natural =
Natural<sizeof(uint)*8>; declared
- for consistency with boolean. These three are meant for creating
zero-initialized values implicitly (various uses)
- R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees
up struct IO {} io; naming]
- SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {}
(status,registers); now
- still some CPU::Status status values ... they didn't really fit into
IO functionality ... will have to think about this more
- SFC CPU, PPU, SMP now use step() exclusively instead of addClocks()
calling into step()
- SFC CPU joypad1_bits, joypad2_bits were unused; killed them
- SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it
- SFC PPU OAM moved into PPU::Object; since nothing else uses it
- the raw uint8[544] array is gone. OAM::read() constructs values from
the OAM::Object[512] table now
- this avoids having to determine how we want to sub-divide the two
OAM memory sections
- this also eliminates the OAM::synchronize() functionality
- probably more I'm forgetting
The FPS fluctuations are driving me insane. This WIP went from 128fps to
137fps. Settled on 133.5fps for the final build. But nothing I changed
should have affected performance at all. This level of fluctuation makes
it damn near impossible to know whether I'm speeding things up or slowing
things down with changes.
2016-07-01 11:50:32 +00:00
|
|
|
latch.hoffset = io.hoffset;
|
|
|
|
latch.voffset = io.voffset;
|
2011-06-05 03:25:24 +00:00
|
|
|
} else if(--mosaic.vcounter == 0) {
|
2017-09-05 00:56:52 +00:00
|
|
|
mosaic.vcounter = mosaic.size + 1;
|
|
|
|
mosaic.voffset += mosaic.size + 1;
|
Update to v099r14 release.
byuu says:
Changelog:
- (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel
like they were contributing enough to be worth it]
- cleaned up nall::integer,natural,real functionality
- toInteger, toNatural, toReal for parsing strings to numbers
- fromInteger, fromNatural, fromReal for creating strings from numbers
- (string,Markup::Node,SQL-based-classes)::(integer,natural,real)
left unchanged
- template<typename T> numeral(T value, long padding, char padchar)
-> string for print() formatting
- deduces integer,natural,real based on T ... cast the value if you
want to override
- there still exists binary,octal,hex,pointer for explicit print()
formatting
- lstring -> string_vector [but using lstring = string_vector; is
declared]
- would be nice to remove the using lstring eventually ... but that'd
probably require 10,000 lines of changes >_>
- format -> string_format [no using here; format was too ambiguous]
- using integer = Integer<sizeof(int)*8>; and using natural =
Natural<sizeof(uint)*8>; declared
- for consistency with boolean. These three are meant for creating
zero-initialized values implicitly (various uses)
- R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees
up struct IO {} io; naming]
- SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {}
(status,registers); now
- still some CPU::Status status values ... they didn't really fit into
IO functionality ... will have to think about this more
- SFC CPU, PPU, SMP now use step() exclusively instead of addClocks()
calling into step()
- SFC CPU joypad1_bits, joypad2_bits were unused; killed them
- SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it
- SFC PPU OAM moved into PPU::Object; since nothing else uses it
- the raw uint8[544] array is gone. OAM::read() constructs values from
the OAM::Object[512] table now
- this avoids having to determine how we want to sub-divide the two
OAM memory sections
- this also eliminates the OAM::synchronize() functionality
- probably more I'm forgetting
The FPS fluctuations are driving me insane. This WIP went from 128fps to
137fps. Settled on 133.5fps for the final build. But nothing I changed
should have affected performance at all. This level of fluctuation makes
it damn near impossible to know whether I'm speeding things up or slowing
things down with changes.
2016-07-01 11:50:32 +00:00
|
|
|
latch.hoffset = io.hoffset;
|
|
|
|
latch.voffset = io.voffset;
|
2010-08-28 09:47:06 +00:00
|
|
|
}
|
|
|
|
|
2017-09-05 00:56:52 +00:00
|
|
|
mosaic.hcounter = mosaic.size + 1;
|
2011-06-05 03:25:24 +00:00
|
|
|
mosaic.hoffset = 0;
|
2011-06-05 03:45:04 +00:00
|
|
|
|
Update to v099r14 release.
byuu says:
Changelog:
- (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel
like they were contributing enough to be worth it]
- cleaned up nall::integer,natural,real functionality
- toInteger, toNatural, toReal for parsing strings to numbers
- fromInteger, fromNatural, fromReal for creating strings from numbers
- (string,Markup::Node,SQL-based-classes)::(integer,natural,real)
left unchanged
- template<typename T> numeral(T value, long padding, char padchar)
-> string for print() formatting
- deduces integer,natural,real based on T ... cast the value if you
want to override
- there still exists binary,octal,hex,pointer for explicit print()
formatting
- lstring -> string_vector [but using lstring = string_vector; is
declared]
- would be nice to remove the using lstring eventually ... but that'd
probably require 10,000 lines of changes >_>
- format -> string_format [no using here; format was too ambiguous]
- using integer = Integer<sizeof(int)*8>; and using natural =
Natural<sizeof(uint)*8>; declared
- for consistency with boolean. These three are meant for creating
zero-initialized values implicitly (various uses)
- R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees
up struct IO {} io; naming]
- SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {}
(status,registers); now
- still some CPU::Status status values ... they didn't really fit into
IO functionality ... will have to think about this more
- SFC CPU, PPU, SMP now use step() exclusively instead of addClocks()
calling into step()
- SFC CPU joypad1_bits, joypad2_bits were unused; killed them
- SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it
- SFC PPU OAM moved into PPU::Object; since nothing else uses it
- the raw uint8[544] array is gone. OAM::read() constructs values from
the OAM::Object[512] table now
- this avoids having to determine how we want to sub-divide the two
OAM memory sections
- this also eliminates the OAM::synchronize() functionality
- probably more I'm forgetting
The FPS fluctuations are driving me insane. This WIP went from 128fps to
137fps. Settled on 133.5fps for the final build. But nothing I changed
should have affected performance at all. This level of fluctuation makes
it damn near impossible to know whether I'm speeding things up or slowing
things down with changes.
2016-07-01 11:50:32 +00:00
|
|
|
if(io.mode == Mode::Mode7) return beginMode7();
|
2017-09-05 00:56:52 +00:00
|
|
|
if(mosaic.size == 0) {
|
Update to v099r14 release.
byuu says:
Changelog:
- (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel
like they were contributing enough to be worth it]
- cleaned up nall::integer,natural,real functionality
- toInteger, toNatural, toReal for parsing strings to numbers
- fromInteger, fromNatural, fromReal for creating strings from numbers
- (string,Markup::Node,SQL-based-classes)::(integer,natural,real)
left unchanged
- template<typename T> numeral(T value, long padding, char padchar)
-> string for print() formatting
- deduces integer,natural,real based on T ... cast the value if you
want to override
- there still exists binary,octal,hex,pointer for explicit print()
formatting
- lstring -> string_vector [but using lstring = string_vector; is
declared]
- would be nice to remove the using lstring eventually ... but that'd
probably require 10,000 lines of changes >_>
- format -> string_format [no using here; format was too ambiguous]
- using integer = Integer<sizeof(int)*8>; and using natural =
Natural<sizeof(uint)*8>; declared
- for consistency with boolean. These three are meant for creating
zero-initialized values implicitly (various uses)
- R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees
up struct IO {} io; naming]
- SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {}
(status,registers); now
- still some CPU::Status status values ... they didn't really fit into
IO functionality ... will have to think about this more
- SFC CPU, PPU, SMP now use step() exclusively instead of addClocks()
calling into step()
- SFC CPU joypad1_bits, joypad2_bits were unused; killed them
- SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it
- SFC PPU OAM moved into PPU::Object; since nothing else uses it
- the raw uint8[544] array is gone. OAM::read() constructs values from
the OAM::Object[512] table now
- this avoids having to determine how we want to sub-divide the two
OAM memory sections
- this also eliminates the OAM::synchronize() functionality
- probably more I'm forgetting
The FPS fluctuations are driving me insane. This WIP went from 128fps to
137fps. Settled on 133.5fps for the final build. But nothing I changed
should have affected performance at all. This level of fluctuation makes
it damn near impossible to know whether I'm speeding things up or slowing
things down with changes.
2016-07-01 11:50:32 +00:00
|
|
|
latch.hoffset = io.hoffset;
|
|
|
|
latch.voffset = io.voffset;
|
2011-06-05 03:45:04 +00:00
|
|
|
}
|
2010-08-09 13:28:56 +00:00
|
|
|
}
|
|
|
|
|
2016-06-14 10:51:54 +00:00
|
|
|
auto PPU::Background::getTile() -> void {
|
2017-09-05 00:56:52 +00:00
|
|
|
uint paletteOffset = ppu.io.bgMode == 0 ? id << 5 : 0;
|
Update to v106r33 release.
byuu says:
Changelog:
- nall/GNUmakefile: added `openmp=(true,false)` option; can be toggled
when building higan/bsnes
- defaults to disabled on macOS, because Xcode doesn't stupidly
doesn't ship with support for it
- higan/GNUmakefile: forgot to switch target,profile back from
bsnes,fast to higan,accurate
- this is just gonna happen from time to time, sorry
- sfc/dsp: when using the fast profile, the DSP syncs per sample
instead of per clock
- should only negatively impact Koushien 2, but is a fairly
significant speedup otherwise
- sfc/ppc,ppu-fast: optimized the code a bit (ppu 130fps to 133fps)
- sfc/ppu-fast: basic vertical mosaic support (not accurate, but
should look okay hopefully)
- sfc/ppu-fast: added missing mode7 hflip support
- sfc/ppu-fast: added support to render at 256-width and/or 240-height
- gives a decent speed boost, and also allows all of the older
quark shaders to work nicely again
- it does violate the contract of Emulator::Interface, but oh
well, it works fine in the bsnes GUI
- sfc/ppu-fast: use cached CGRAM values for mode7 and sprites
- sfc/ppu-fast: use global range/time over flags in object rendering
- may not actually work as we intended since it's a race condition
even if it's only ORing the flags
- really don't want to have to make those variables atomic if I
don't have to
- sfc/ppu-fast: should fully support interlace and overscan modes now
- hiro/cocoa: updated macOS Gatekeeper disable support to work on
10.13+
- ruby: forgot to fix macOS input driver, sorry
- nall/GNUmakefile: if uname is present, then just default to rm
instead of del (fixes Msys)
Note: blur emulation option will break pretty badly in 256x240 output
mode. I'll fix it later.
2018-05-31 07:06:55 +00:00
|
|
|
uint paletteSize = 2 << io.mode;
|
|
|
|
uint tileMask = ppu.vram.mask >> 3 + io.mode;
|
|
|
|
uint tiledataIndex = io.tiledataAddress >> 3 + io.mode;
|
2010-08-09 13:28:56 +00:00
|
|
|
|
Update to v106r33 release.
byuu says:
Changelog:
- nall/GNUmakefile: added `openmp=(true,false)` option; can be toggled
when building higan/bsnes
- defaults to disabled on macOS, because Xcode doesn't stupidly
doesn't ship with support for it
- higan/GNUmakefile: forgot to switch target,profile back from
bsnes,fast to higan,accurate
- this is just gonna happen from time to time, sorry
- sfc/dsp: when using the fast profile, the DSP syncs per sample
instead of per clock
- should only negatively impact Koushien 2, but is a fairly
significant speedup otherwise
- sfc/ppc,ppu-fast: optimized the code a bit (ppu 130fps to 133fps)
- sfc/ppu-fast: basic vertical mosaic support (not accurate, but
should look okay hopefully)
- sfc/ppu-fast: added missing mode7 hflip support
- sfc/ppu-fast: added support to render at 256-width and/or 240-height
- gives a decent speed boost, and also allows all of the older
quark shaders to work nicely again
- it does violate the contract of Emulator::Interface, but oh
well, it works fine in the bsnes GUI
- sfc/ppu-fast: use cached CGRAM values for mode7 and sprites
- sfc/ppu-fast: use global range/time over flags in object rendering
- may not actually work as we intended since it's a race condition
even if it's only ORing the flags
- really don't want to have to make those variables atomic if I
don't have to
- sfc/ppu-fast: should fully support interlace and overscan modes now
- hiro/cocoa: updated macOS Gatekeeper disable support to work on
10.13+
- ruby: forgot to fix macOS input driver, sorry
- nall/GNUmakefile: if uname is present, then just default to rm
instead of del (fixes Msys)
Note: blur emulation option will break pretty badly in 256x240 output
mode. I'll fix it later.
2018-05-31 07:06:55 +00:00
|
|
|
uint tileHeight = 3 + io.tileSize;
|
2017-09-05 00:56:52 +00:00
|
|
|
uint tileWidth = !hires() ? tileHeight : 4;
|
2010-08-09 13:28:56 +00:00
|
|
|
|
2017-09-05 00:56:52 +00:00
|
|
|
uint width = 256 << hires();
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
|
Update to v106r33 release.
byuu says:
Changelog:
- nall/GNUmakefile: added `openmp=(true,false)` option; can be toggled
when building higan/bsnes
- defaults to disabled on macOS, because Xcode doesn't stupidly
doesn't ship with support for it
- higan/GNUmakefile: forgot to switch target,profile back from
bsnes,fast to higan,accurate
- this is just gonna happen from time to time, sorry
- sfc/dsp: when using the fast profile, the DSP syncs per sample
instead of per clock
- should only negatively impact Koushien 2, but is a fairly
significant speedup otherwise
- sfc/ppc,ppu-fast: optimized the code a bit (ppu 130fps to 133fps)
- sfc/ppu-fast: basic vertical mosaic support (not accurate, but
should look okay hopefully)
- sfc/ppu-fast: added missing mode7 hflip support
- sfc/ppu-fast: added support to render at 256-width and/or 240-height
- gives a decent speed boost, and also allows all of the older
quark shaders to work nicely again
- it does violate the contract of Emulator::Interface, but oh
well, it works fine in the bsnes GUI
- sfc/ppu-fast: use cached CGRAM values for mode7 and sprites
- sfc/ppu-fast: use global range/time over flags in object rendering
- may not actually work as we intended since it's a race condition
even if it's only ORing the flags
- really don't want to have to make those variables atomic if I
don't have to
- sfc/ppu-fast: should fully support interlace and overscan modes now
- hiro/cocoa: updated macOS Gatekeeper disable support to work on
10.13+
- ruby: forgot to fix macOS input driver, sorry
- nall/GNUmakefile: if uname is present, then just default to rm
instead of del (fixes Msys)
Note: blur emulation option will break pretty badly in 256x240 output
mode. I'll fix it later.
2018-05-31 07:06:55 +00:00
|
|
|
uint hmask = (width << io.tileSize << io.screenSize.bit(0)) - 1;
|
|
|
|
uint vmask = (width << io.tileSize << io.screenSize.bit(1)) - 1;
|
2010-08-09 13:28:56 +00:00
|
|
|
|
2017-09-05 00:56:52 +00:00
|
|
|
uint px = x << hires();
|
|
|
|
uint py = mosaic.enable ? (uint)mosaic.voffset : y;
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
|
2016-06-14 10:51:54 +00:00
|
|
|
uint hscroll = hoffset();
|
|
|
|
uint vscroll = voffset();
|
2017-09-05 00:56:52 +00:00
|
|
|
if(hires()) {
|
2010-08-09 13:28:56 +00:00
|
|
|
hscroll <<= 1;
|
2017-09-05 00:56:52 +00:00
|
|
|
if(ppu.io.interlace) py = py << 1 | (ppu.field() & !mosaic.enable); //todo: temporary vmosaic hack
|
2010-08-09 13:28:56 +00:00
|
|
|
}
|
|
|
|
|
2016-06-14 10:51:54 +00:00
|
|
|
uint hoffset = hscroll + px;
|
|
|
|
uint voffset = vscroll + py;
|
2010-08-09 13:28:56 +00:00
|
|
|
|
Update to v099r14 release.
byuu says:
Changelog:
- (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel
like they were contributing enough to be worth it]
- cleaned up nall::integer,natural,real functionality
- toInteger, toNatural, toReal for parsing strings to numbers
- fromInteger, fromNatural, fromReal for creating strings from numbers
- (string,Markup::Node,SQL-based-classes)::(integer,natural,real)
left unchanged
- template<typename T> numeral(T value, long padding, char padchar)
-> string for print() formatting
- deduces integer,natural,real based on T ... cast the value if you
want to override
- there still exists binary,octal,hex,pointer for explicit print()
formatting
- lstring -> string_vector [but using lstring = string_vector; is
declared]
- would be nice to remove the using lstring eventually ... but that'd
probably require 10,000 lines of changes >_>
- format -> string_format [no using here; format was too ambiguous]
- using integer = Integer<sizeof(int)*8>; and using natural =
Natural<sizeof(uint)*8>; declared
- for consistency with boolean. These three are meant for creating
zero-initialized values implicitly (various uses)
- R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees
up struct IO {} io; naming]
- SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {}
(status,registers); now
- still some CPU::Status status values ... they didn't really fit into
IO functionality ... will have to think about this more
- SFC CPU, PPU, SMP now use step() exclusively instead of addClocks()
calling into step()
- SFC CPU joypad1_bits, joypad2_bits were unused; killed them
- SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it
- SFC PPU OAM moved into PPU::Object; since nothing else uses it
- the raw uint8[544] array is gone. OAM::read() constructs values from
the OAM::Object[512] table now
- this avoids having to determine how we want to sub-divide the two
OAM memory sections
- this also eliminates the OAM::synchronize() functionality
- probably more I'm forgetting
The FPS fluctuations are driving me insane. This WIP went from 128fps to
137fps. Settled on 133.5fps for the final build. But nothing I changed
should have affected performance at all. This level of fluctuation makes
it damn near impossible to know whether I'm speeding things up or slowing
things down with changes.
2016-07-01 11:50:32 +00:00
|
|
|
if(ppu.io.bgMode == 2 || ppu.io.bgMode == 4 || ppu.io.bgMode == 6) {
|
2017-09-06 02:38:00 +00:00
|
|
|
uint16 offsetX = px + (hscroll & 7);
|
2010-08-09 13:28:56 +00:00
|
|
|
|
2016-06-14 10:51:54 +00:00
|
|
|
if(offsetX >= 8) {
|
Update to v106r33 release.
byuu says:
Changelog:
- nall/GNUmakefile: added `openmp=(true,false)` option; can be toggled
when building higan/bsnes
- defaults to disabled on macOS, because Xcode doesn't stupidly
doesn't ship with support for it
- higan/GNUmakefile: forgot to switch target,profile back from
bsnes,fast to higan,accurate
- this is just gonna happen from time to time, sorry
- sfc/dsp: when using the fast profile, the DSP syncs per sample
instead of per clock
- should only negatively impact Koushien 2, but is a fairly
significant speedup otherwise
- sfc/ppc,ppu-fast: optimized the code a bit (ppu 130fps to 133fps)
- sfc/ppu-fast: basic vertical mosaic support (not accurate, but
should look okay hopefully)
- sfc/ppu-fast: added missing mode7 hflip support
- sfc/ppu-fast: added support to render at 256-width and/or 240-height
- gives a decent speed boost, and also allows all of the older
quark shaders to work nicely again
- it does violate the contract of Emulator::Interface, but oh
well, it works fine in the bsnes GUI
- sfc/ppu-fast: use cached CGRAM values for mode7 and sprites
- sfc/ppu-fast: use global range/time over flags in object rendering
- may not actually work as we intended since it's a race condition
even if it's only ORing the flags
- really don't want to have to make those variables atomic if I
don't have to
- sfc/ppu-fast: should fully support interlace and overscan modes now
- hiro/cocoa: updated macOS Gatekeeper disable support to work on
10.13+
- ruby: forgot to fix macOS input driver, sorry
- nall/GNUmakefile: if uname is present, then just default to rm
instead of del (fixes Msys)
Note: blur emulation option will break pretty badly in 256x240 output
mode. I'll fix it later.
2018-05-31 07:06:55 +00:00
|
|
|
auto hlookup = ppu.bg3.getTile((offsetX - 8) + (ppu.bg3.hoffset() & ~7), ppu.bg3.voffset() + 0);
|
|
|
|
auto vlookup = ppu.bg3.getTile((offsetX - 8) + (ppu.bg3.hoffset() & ~7), ppu.bg3.voffset() + 8);
|
|
|
|
uint valid = 13 + id;
|
2010-08-09 13:28:56 +00:00
|
|
|
|
Update to v099r14 release.
byuu says:
Changelog:
- (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel
like they were contributing enough to be worth it]
- cleaned up nall::integer,natural,real functionality
- toInteger, toNatural, toReal for parsing strings to numbers
- fromInteger, fromNatural, fromReal for creating strings from numbers
- (string,Markup::Node,SQL-based-classes)::(integer,natural,real)
left unchanged
- template<typename T> numeral(T value, long padding, char padchar)
-> string for print() formatting
- deduces integer,natural,real based on T ... cast the value if you
want to override
- there still exists binary,octal,hex,pointer for explicit print()
formatting
- lstring -> string_vector [but using lstring = string_vector; is
declared]
- would be nice to remove the using lstring eventually ... but that'd
probably require 10,000 lines of changes >_>
- format -> string_format [no using here; format was too ambiguous]
- using integer = Integer<sizeof(int)*8>; and using natural =
Natural<sizeof(uint)*8>; declared
- for consistency with boolean. These three are meant for creating
zero-initialized values implicitly (various uses)
- R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees
up struct IO {} io; naming]
- SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {}
(status,registers); now
- still some CPU::Status status values ... they didn't really fit into
IO functionality ... will have to think about this more
- SFC CPU, PPU, SMP now use step() exclusively instead of addClocks()
calling into step()
- SFC CPU joypad1_bits, joypad2_bits were unused; killed them
- SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it
- SFC PPU OAM moved into PPU::Object; since nothing else uses it
- the raw uint8[544] array is gone. OAM::read() constructs values from
the OAM::Object[512] table now
- this avoids having to determine how we want to sub-divide the two
OAM memory sections
- this also eliminates the OAM::synchronize() functionality
- probably more I'm forgetting
The FPS fluctuations are driving me insane. This WIP went from 128fps to
137fps. Settled on 133.5fps for the final build. But nothing I changed
should have affected performance at all. This level of fluctuation makes
it damn near impossible to know whether I'm speeding things up or slowing
things down with changes.
2016-07-01 11:50:32 +00:00
|
|
|
if(ppu.io.bgMode == 4) {
|
Update to v106r33 release.
byuu says:
Changelog:
- nall/GNUmakefile: added `openmp=(true,false)` option; can be toggled
when building higan/bsnes
- defaults to disabled on macOS, because Xcode doesn't stupidly
doesn't ship with support for it
- higan/GNUmakefile: forgot to switch target,profile back from
bsnes,fast to higan,accurate
- this is just gonna happen from time to time, sorry
- sfc/dsp: when using the fast profile, the DSP syncs per sample
instead of per clock
- should only negatively impact Koushien 2, but is a fairly
significant speedup otherwise
- sfc/ppc,ppu-fast: optimized the code a bit (ppu 130fps to 133fps)
- sfc/ppu-fast: basic vertical mosaic support (not accurate, but
should look okay hopefully)
- sfc/ppu-fast: added missing mode7 hflip support
- sfc/ppu-fast: added support to render at 256-width and/or 240-height
- gives a decent speed boost, and also allows all of the older
quark shaders to work nicely again
- it does violate the contract of Emulator::Interface, but oh
well, it works fine in the bsnes GUI
- sfc/ppu-fast: use cached CGRAM values for mode7 and sprites
- sfc/ppu-fast: use global range/time over flags in object rendering
- may not actually work as we intended since it's a race condition
even if it's only ORing the flags
- really don't want to have to make those variables atomic if I
don't have to
- sfc/ppu-fast: should fully support interlace and overscan modes now
- hiro/cocoa: updated macOS Gatekeeper disable support to work on
10.13+
- ruby: forgot to fix macOS input driver, sorry
- nall/GNUmakefile: if uname is present, then just default to rm
instead of del (fixes Msys)
Note: blur emulation option will break pretty badly in 256x240 output
mode. I'll fix it later.
2018-05-31 07:06:55 +00:00
|
|
|
if(hlookup.bit(valid)) {
|
|
|
|
if(!hlookup.bit(15)) {
|
|
|
|
hoffset = offsetX + (hlookup & ~7);
|
2010-08-09 13:28:56 +00:00
|
|
|
} else {
|
Update to v106r33 release.
byuu says:
Changelog:
- nall/GNUmakefile: added `openmp=(true,false)` option; can be toggled
when building higan/bsnes
- defaults to disabled on macOS, because Xcode doesn't stupidly
doesn't ship with support for it
- higan/GNUmakefile: forgot to switch target,profile back from
bsnes,fast to higan,accurate
- this is just gonna happen from time to time, sorry
- sfc/dsp: when using the fast profile, the DSP syncs per sample
instead of per clock
- should only negatively impact Koushien 2, but is a fairly
significant speedup otherwise
- sfc/ppc,ppu-fast: optimized the code a bit (ppu 130fps to 133fps)
- sfc/ppu-fast: basic vertical mosaic support (not accurate, but
should look okay hopefully)
- sfc/ppu-fast: added missing mode7 hflip support
- sfc/ppu-fast: added support to render at 256-width and/or 240-height
- gives a decent speed boost, and also allows all of the older
quark shaders to work nicely again
- it does violate the contract of Emulator::Interface, but oh
well, it works fine in the bsnes GUI
- sfc/ppu-fast: use cached CGRAM values for mode7 and sprites
- sfc/ppu-fast: use global range/time over flags in object rendering
- may not actually work as we intended since it's a race condition
even if it's only ORing the flags
- really don't want to have to make those variables atomic if I
don't have to
- sfc/ppu-fast: should fully support interlace and overscan modes now
- hiro/cocoa: updated macOS Gatekeeper disable support to work on
10.13+
- ruby: forgot to fix macOS input driver, sorry
- nall/GNUmakefile: if uname is present, then just default to rm
instead of del (fixes Msys)
Note: blur emulation option will break pretty badly in 256x240 output
mode. I'll fix it later.
2018-05-31 07:06:55 +00:00
|
|
|
voffset = py + hlookup;
|
2010-08-09 13:28:56 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
} else {
|
Update to v106r33 release.
byuu says:
Changelog:
- nall/GNUmakefile: added `openmp=(true,false)` option; can be toggled
when building higan/bsnes
- defaults to disabled on macOS, because Xcode doesn't stupidly
doesn't ship with support for it
- higan/GNUmakefile: forgot to switch target,profile back from
bsnes,fast to higan,accurate
- this is just gonna happen from time to time, sorry
- sfc/dsp: when using the fast profile, the DSP syncs per sample
instead of per clock
- should only negatively impact Koushien 2, but is a fairly
significant speedup otherwise
- sfc/ppc,ppu-fast: optimized the code a bit (ppu 130fps to 133fps)
- sfc/ppu-fast: basic vertical mosaic support (not accurate, but
should look okay hopefully)
- sfc/ppu-fast: added missing mode7 hflip support
- sfc/ppu-fast: added support to render at 256-width and/or 240-height
- gives a decent speed boost, and also allows all of the older
quark shaders to work nicely again
- it does violate the contract of Emulator::Interface, but oh
well, it works fine in the bsnes GUI
- sfc/ppu-fast: use cached CGRAM values for mode7 and sprites
- sfc/ppu-fast: use global range/time over flags in object rendering
- may not actually work as we intended since it's a race condition
even if it's only ORing the flags
- really don't want to have to make those variables atomic if I
don't have to
- sfc/ppu-fast: should fully support interlace and overscan modes now
- hiro/cocoa: updated macOS Gatekeeper disable support to work on
10.13+
- ruby: forgot to fix macOS input driver, sorry
- nall/GNUmakefile: if uname is present, then just default to rm
instead of del (fixes Msys)
Note: blur emulation option will break pretty badly in 256x240 output
mode. I'll fix it later.
2018-05-31 07:06:55 +00:00
|
|
|
if(hlookup.bit(valid)) hoffset = offsetX + (hlookup & ~7);
|
|
|
|
if(vlookup.bit(valid)) voffset = py + vlookup;
|
2010-08-09 13:28:56 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2011-06-05 03:25:24 +00:00
|
|
|
hoffset &= hmask;
|
|
|
|
voffset &= vmask;
|
2010-08-09 13:28:56 +00:00
|
|
|
|
Update to v106r33 release.
byuu says:
Changelog:
- nall/GNUmakefile: added `openmp=(true,false)` option; can be toggled
when building higan/bsnes
- defaults to disabled on macOS, because Xcode doesn't stupidly
doesn't ship with support for it
- higan/GNUmakefile: forgot to switch target,profile back from
bsnes,fast to higan,accurate
- this is just gonna happen from time to time, sorry
- sfc/dsp: when using the fast profile, the DSP syncs per sample
instead of per clock
- should only negatively impact Koushien 2, but is a fairly
significant speedup otherwise
- sfc/ppc,ppu-fast: optimized the code a bit (ppu 130fps to 133fps)
- sfc/ppu-fast: basic vertical mosaic support (not accurate, but
should look okay hopefully)
- sfc/ppu-fast: added missing mode7 hflip support
- sfc/ppu-fast: added support to render at 256-width and/or 240-height
- gives a decent speed boost, and also allows all of the older
quark shaders to work nicely again
- it does violate the contract of Emulator::Interface, but oh
well, it works fine in the bsnes GUI
- sfc/ppu-fast: use cached CGRAM values for mode7 and sprites
- sfc/ppu-fast: use global range/time over flags in object rendering
- may not actually work as we intended since it's a race condition
even if it's only ORing the flags
- really don't want to have to make those variables atomic if I
don't have to
- sfc/ppu-fast: should fully support interlace and overscan modes now
- hiro/cocoa: updated macOS Gatekeeper disable support to work on
10.13+
- ruby: forgot to fix macOS input driver, sorry
- nall/GNUmakefile: if uname is present, then just default to rm
instead of del (fixes Msys)
Note: blur emulation option will break pretty badly in 256x240 output
mode. I'll fix it later.
2018-05-31 07:06:55 +00:00
|
|
|
uint screenX = io.screenSize.bit(0) ? 32 << 5 : 0;
|
|
|
|
uint screenY = io.screenSize.bit(1) ? 32 << 5 + io.screenSize.bit(0) : 0;
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
|
Update to v106r33 release.
byuu says:
Changelog:
- nall/GNUmakefile: added `openmp=(true,false)` option; can be toggled
when building higan/bsnes
- defaults to disabled on macOS, because Xcode doesn't stupidly
doesn't ship with support for it
- higan/GNUmakefile: forgot to switch target,profile back from
bsnes,fast to higan,accurate
- this is just gonna happen from time to time, sorry
- sfc/dsp: when using the fast profile, the DSP syncs per sample
instead of per clock
- should only negatively impact Koushien 2, but is a fairly
significant speedup otherwise
- sfc/ppc,ppu-fast: optimized the code a bit (ppu 130fps to 133fps)
- sfc/ppu-fast: basic vertical mosaic support (not accurate, but
should look okay hopefully)
- sfc/ppu-fast: added missing mode7 hflip support
- sfc/ppu-fast: added support to render at 256-width and/or 240-height
- gives a decent speed boost, and also allows all of the older
quark shaders to work nicely again
- it does violate the contract of Emulator::Interface, but oh
well, it works fine in the bsnes GUI
- sfc/ppu-fast: use cached CGRAM values for mode7 and sprites
- sfc/ppu-fast: use global range/time over flags in object rendering
- may not actually work as we intended since it's a race condition
even if it's only ORing the flags
- really don't want to have to make those variables atomic if I
don't have to
- sfc/ppu-fast: should fully support interlace and overscan modes now
- hiro/cocoa: updated macOS Gatekeeper disable support to work on
10.13+
- ruby: forgot to fix macOS input driver, sorry
- nall/GNUmakefile: if uname is present, then just default to rm
instead of del (fixes Msys)
Note: blur emulation option will break pretty badly in 256x240 output
mode. I'll fix it later.
2018-05-31 07:06:55 +00:00
|
|
|
uint tileX = hoffset >> tileWidth;
|
|
|
|
uint tileY = voffset >> tileHeight;
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
|
Update to v106r33 release.
byuu says:
Changelog:
- nall/GNUmakefile: added `openmp=(true,false)` option; can be toggled
when building higan/bsnes
- defaults to disabled on macOS, because Xcode doesn't stupidly
doesn't ship with support for it
- higan/GNUmakefile: forgot to switch target,profile back from
bsnes,fast to higan,accurate
- this is just gonna happen from time to time, sorry
- sfc/dsp: when using the fast profile, the DSP syncs per sample
instead of per clock
- should only negatively impact Koushien 2, but is a fairly
significant speedup otherwise
- sfc/ppc,ppu-fast: optimized the code a bit (ppu 130fps to 133fps)
- sfc/ppu-fast: basic vertical mosaic support (not accurate, but
should look okay hopefully)
- sfc/ppu-fast: added missing mode7 hflip support
- sfc/ppu-fast: added support to render at 256-width and/or 240-height
- gives a decent speed boost, and also allows all of the older
quark shaders to work nicely again
- it does violate the contract of Emulator::Interface, but oh
well, it works fine in the bsnes GUI
- sfc/ppu-fast: use cached CGRAM values for mode7 and sprites
- sfc/ppu-fast: use global range/time over flags in object rendering
- may not actually work as we intended since it's a race condition
even if it's only ORing the flags
- really don't want to have to make those variables atomic if I
don't have to
- sfc/ppu-fast: should fully support interlace and overscan modes now
- hiro/cocoa: updated macOS Gatekeeper disable support to work on
10.13+
- ruby: forgot to fix macOS input driver, sorry
- nall/GNUmakefile: if uname is present, then just default to rm
instead of del (fixes Msys)
Note: blur emulation option will break pretty badly in 256x240 output
mode. I'll fix it later.
2018-05-31 07:06:55 +00:00
|
|
|
uint16 offset = (tileY & 0x1f) << 5 | (tileX & 0x1f);
|
|
|
|
if(tileX & 0x20) offset += screenX;
|
|
|
|
if(tileY & 0x20) offset += screenY;
|
2010-08-09 13:28:56 +00:00
|
|
|
|
Update to v099r14 release.
byuu says:
Changelog:
- (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel
like they were contributing enough to be worth it]
- cleaned up nall::integer,natural,real functionality
- toInteger, toNatural, toReal for parsing strings to numbers
- fromInteger, fromNatural, fromReal for creating strings from numbers
- (string,Markup::Node,SQL-based-classes)::(integer,natural,real)
left unchanged
- template<typename T> numeral(T value, long padding, char padchar)
-> string for print() formatting
- deduces integer,natural,real based on T ... cast the value if you
want to override
- there still exists binary,octal,hex,pointer for explicit print()
formatting
- lstring -> string_vector [but using lstring = string_vector; is
declared]
- would be nice to remove the using lstring eventually ... but that'd
probably require 10,000 lines of changes >_>
- format -> string_format [no using here; format was too ambiguous]
- using integer = Integer<sizeof(int)*8>; and using natural =
Natural<sizeof(uint)*8>; declared
- for consistency with boolean. These three are meant for creating
zero-initialized values implicitly (various uses)
- R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees
up struct IO {} io; naming]
- SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {}
(status,registers); now
- still some CPU::Status status values ... they didn't really fit into
IO functionality ... will have to think about this more
- SFC CPU, PPU, SMP now use step() exclusively instead of addClocks()
calling into step()
- SFC CPU joypad1_bits, joypad2_bits were unused; killed them
- SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it
- SFC PPU OAM moved into PPU::Object; since nothing else uses it
- the raw uint8[544] array is gone. OAM::read() constructs values from
the OAM::Object[512] table now
- this avoids having to determine how we want to sub-divide the two
OAM memory sections
- this also eliminates the OAM::synchronize() functionality
- probably more I'm forgetting
The FPS fluctuations are driving me insane. This WIP went from 128fps to
137fps. Settled on 133.5fps for the final build. But nothing I changed
should have affected performance at all. This level of fluctuation makes
it damn near impossible to know whether I'm speeding things up or slowing
things down with changes.
2016-07-01 11:50:32 +00:00
|
|
|
uint16 address = io.screenAddress + offset;
|
Update to v099r07 release.
byuu says:
Changelog:
- (hopefully) fixed BS Memory and Sufami Turbo slot loading
- ported GB, GBA, WS cores to use nall/vfs
- completely removed loadRequest, saveRequest functionality from
Emulator::Interface and ui-tomoko
- loadRequest(folder) is now load(folder)
- save states now use a shared Emulator::SerializerVersion string
- whenever this is bumped, all older states will break; but this makes
bumping state versions way easier
- also, the version string makes it a lot easier to identify
compatibility windows for save states
- SNES PPU now uses uint16 vram[32768] for memory accesses [hex_usr]
NOTE: Super Game Boy loading is currently broken, and I'm not entirely
sure how to fix it :/
The file loading handoff was -really- complicated, and so I'm kind of
at a loss ... so for now, don't try it.
Everything else should theoretically work, so please report any bugs
you find.
So, this is pretty much it. I'd be very curious to hear feedback from
people who objected to the old nall/stream design, whether they are
happy with the new file loading system or think it could use further
improvements.
The 16-bit VRAM turned out to be a wash on performance (roughly the same
as before. 1fps slower on Zelda 3, 1fps faster on Yoshi's Island.) The
main reason for this was because Yoshi's Island was breaking horribly
until I changed the vramRead, vramWrite functions to take uint15 instead
of uint16.
I suspect the issue is we're using uint16s in some areas now that need
to be uint15, and this game is setting the VRAM address to 0x8000+,
causing us to go out of bounds on memory accesses.
But ... I want to go ahead and do something cute for fun, and just because
we can ... and this new interface is so incredibly perfect for it!! I
want to support an SNES unit with 128KiB of VRAM. Not out of the box,
but as a fun little tweakable thing. The SNES was clearly designed to
support that, they just didn't use big enough VRAM chips, and left one
of the lines disconnected. So ... let's connect it anyway!
In the end, if we design it right, the only code difference should be
one area where we mask by 15-bits instead of by 16-bits.
2016-06-24 12:09:30 +00:00
|
|
|
tile = ppu.vram[address];
|
2017-09-05 00:56:52 +00:00
|
|
|
bool mirrorY = tile.bit(15);
|
|
|
|
bool mirrorX = tile.bit(14);
|
|
|
|
priority = io.priority[tile.bit(13)];
|
|
|
|
paletteNumber = tile.bits(10,12);
|
2016-06-14 10:51:54 +00:00
|
|
|
paletteIndex = paletteOffset + (paletteNumber << paletteSize);
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
|
2017-09-05 00:56:52 +00:00
|
|
|
if(tileWidth == 4 && (bool)(hoffset & 8) != mirrorX) tile += 1;
|
|
|
|
if(tileHeight == 4 && (bool)(voffset & 8) != mirrorY) tile += 16;
|
|
|
|
uint16 character = tile.bits(0,9) + tiledataIndex & tileMask;
|
2010-08-09 13:28:56 +00:00
|
|
|
|
2016-06-14 10:51:54 +00:00
|
|
|
if(mirrorY) voffset ^= 7;
|
Update to v106r33 release.
byuu says:
Changelog:
- nall/GNUmakefile: added `openmp=(true,false)` option; can be toggled
when building higan/bsnes
- defaults to disabled on macOS, because Xcode doesn't stupidly
doesn't ship with support for it
- higan/GNUmakefile: forgot to switch target,profile back from
bsnes,fast to higan,accurate
- this is just gonna happen from time to time, sorry
- sfc/dsp: when using the fast profile, the DSP syncs per sample
instead of per clock
- should only negatively impact Koushien 2, but is a fairly
significant speedup otherwise
- sfc/ppc,ppu-fast: optimized the code a bit (ppu 130fps to 133fps)
- sfc/ppu-fast: basic vertical mosaic support (not accurate, but
should look okay hopefully)
- sfc/ppu-fast: added missing mode7 hflip support
- sfc/ppu-fast: added support to render at 256-width and/or 240-height
- gives a decent speed boost, and also allows all of the older
quark shaders to work nicely again
- it does violate the contract of Emulator::Interface, but oh
well, it works fine in the bsnes GUI
- sfc/ppu-fast: use cached CGRAM values for mode7 and sprites
- sfc/ppu-fast: use global range/time over flags in object rendering
- may not actually work as we intended since it's a race condition
even if it's only ORing the flags
- really don't want to have to make those variables atomic if I
don't have to
- sfc/ppu-fast: should fully support interlace and overscan modes now
- hiro/cocoa: updated macOS Gatekeeper disable support to work on
10.13+
- ruby: forgot to fix macOS input driver, sorry
- nall/GNUmakefile: if uname is present, then just default to rm
instead of del (fixes Msys)
Note: blur emulation option will break pretty badly in 256x240 output
mode. I'll fix it later.
2018-05-31 07:06:55 +00:00
|
|
|
offset = (character << 3 + io.mode) + (voffset & 7);
|
2010-08-09 13:28:56 +00:00
|
|
|
|
Update to v099r14 release.
byuu says:
Changelog:
- (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel
like they were contributing enough to be worth it]
- cleaned up nall::integer,natural,real functionality
- toInteger, toNatural, toReal for parsing strings to numbers
- fromInteger, fromNatural, fromReal for creating strings from numbers
- (string,Markup::Node,SQL-based-classes)::(integer,natural,real)
left unchanged
- template<typename T> numeral(T value, long padding, char padchar)
-> string for print() formatting
- deduces integer,natural,real based on T ... cast the value if you
want to override
- there still exists binary,octal,hex,pointer for explicit print()
formatting
- lstring -> string_vector [but using lstring = string_vector; is
declared]
- would be nice to remove the using lstring eventually ... but that'd
probably require 10,000 lines of changes >_>
- format -> string_format [no using here; format was too ambiguous]
- using integer = Integer<sizeof(int)*8>; and using natural =
Natural<sizeof(uint)*8>; declared
- for consistency with boolean. These three are meant for creating
zero-initialized values implicitly (various uses)
- R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees
up struct IO {} io; naming]
- SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {}
(status,registers); now
- still some CPU::Status status values ... they didn't really fit into
IO functionality ... will have to think about this more
- SFC CPU, PPU, SMP now use step() exclusively instead of addClocks()
calling into step()
- SFC CPU joypad1_bits, joypad2_bits were unused; killed them
- SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it
- SFC PPU OAM moved into PPU::Object; since nothing else uses it
- the raw uint8[544] array is gone. OAM::read() constructs values from
the OAM::Object[512] table now
- this avoids having to determine how we want to sub-divide the two
OAM memory sections
- this also eliminates the OAM::synchronize() functionality
- probably more I'm forgetting
The FPS fluctuations are driving me insane. This WIP went from 128fps to
137fps. Settled on 133.5fps for the final build. But nothing I changed
should have affected performance at all. This level of fluctuation makes
it damn near impossible to know whether I'm speeding things up or slowing
things down with changes.
2016-07-01 11:50:32 +00:00
|
|
|
switch(io.mode) {
|
2012-02-18 07:49:52 +00:00
|
|
|
case Mode::BPP8:
|
Update to v099r07 release.
byuu says:
Changelog:
- (hopefully) fixed BS Memory and Sufami Turbo slot loading
- ported GB, GBA, WS cores to use nall/vfs
- completely removed loadRequest, saveRequest functionality from
Emulator::Interface and ui-tomoko
- loadRequest(folder) is now load(folder)
- save states now use a shared Emulator::SerializerVersion string
- whenever this is bumped, all older states will break; but this makes
bumping state versions way easier
- also, the version string makes it a lot easier to identify
compatibility windows for save states
- SNES PPU now uses uint16 vram[32768] for memory accesses [hex_usr]
NOTE: Super Game Boy loading is currently broken, and I'm not entirely
sure how to fix it :/
The file loading handoff was -really- complicated, and so I'm kind of
at a loss ... so for now, don't try it.
Everything else should theoretically work, so please report any bugs
you find.
So, this is pretty much it. I'd be very curious to hear feedback from
people who objected to the old nall/stream design, whether they are
happy with the new file loading system or think it could use further
improvements.
The 16-bit VRAM turned out to be a wash on performance (roughly the same
as before. 1fps slower on Zelda 3, 1fps faster on Yoshi's Island.) The
main reason for this was because Yoshi's Island was breaking horribly
until I changed the vramRead, vramWrite functions to take uint15 instead
of uint16.
I suspect the issue is we're using uint16s in some areas now that need
to be uint15, and this game is setting the VRAM address to 0x8000+,
causing us to go out of bounds on memory accesses.
But ... I want to go ahead and do something cute for fun, and just because
we can ... and this new interface is so incredibly perfect for it!! I
want to support an SNES unit with 128KiB of VRAM. Not out of the box,
but as a fun little tweakable thing. The SNES was clearly designed to
support that, they just didn't use big enough VRAM chips, and left one
of the lines disconnected. So ... let's connect it anyway!
In the end, if we design it right, the only code difference should be
one area where we mask by 15-bits instead of by 16-bits.
2016-06-24 12:09:30 +00:00
|
|
|
data[1].bits(16,31) = ppu.vram[offset + 24];
|
|
|
|
data[1].bits( 0,15) = ppu.vram[offset + 16];
|
2012-02-18 07:49:52 +00:00
|
|
|
case Mode::BPP4:
|
Update to v099r07 release.
byuu says:
Changelog:
- (hopefully) fixed BS Memory and Sufami Turbo slot loading
- ported GB, GBA, WS cores to use nall/vfs
- completely removed loadRequest, saveRequest functionality from
Emulator::Interface and ui-tomoko
- loadRequest(folder) is now load(folder)
- save states now use a shared Emulator::SerializerVersion string
- whenever this is bumped, all older states will break; but this makes
bumping state versions way easier
- also, the version string makes it a lot easier to identify
compatibility windows for save states
- SNES PPU now uses uint16 vram[32768] for memory accesses [hex_usr]
NOTE: Super Game Boy loading is currently broken, and I'm not entirely
sure how to fix it :/
The file loading handoff was -really- complicated, and so I'm kind of
at a loss ... so for now, don't try it.
Everything else should theoretically work, so please report any bugs
you find.
So, this is pretty much it. I'd be very curious to hear feedback from
people who objected to the old nall/stream design, whether they are
happy with the new file loading system or think it could use further
improvements.
The 16-bit VRAM turned out to be a wash on performance (roughly the same
as before. 1fps slower on Zelda 3, 1fps faster on Yoshi's Island.) The
main reason for this was because Yoshi's Island was breaking horribly
until I changed the vramRead, vramWrite functions to take uint15 instead
of uint16.
I suspect the issue is we're using uint16s in some areas now that need
to be uint15, and this game is setting the VRAM address to 0x8000+,
causing us to go out of bounds on memory accesses.
But ... I want to go ahead and do something cute for fun, and just because
we can ... and this new interface is so incredibly perfect for it!! I
want to support an SNES unit with 128KiB of VRAM. Not out of the box,
but as a fun little tweakable thing. The SNES was clearly designed to
support that, they just didn't use big enough VRAM chips, and left one
of the lines disconnected. So ... let's connect it anyway!
In the end, if we design it right, the only code difference should be
one area where we mask by 15-bits instead of by 16-bits.
2016-06-24 12:09:30 +00:00
|
|
|
data[0].bits(16,31) = ppu.vram[offset + 8];
|
2012-02-18 07:49:52 +00:00
|
|
|
case Mode::BPP2:
|
Update to v099r07 release.
byuu says:
Changelog:
- (hopefully) fixed BS Memory and Sufami Turbo slot loading
- ported GB, GBA, WS cores to use nall/vfs
- completely removed loadRequest, saveRequest functionality from
Emulator::Interface and ui-tomoko
- loadRequest(folder) is now load(folder)
- save states now use a shared Emulator::SerializerVersion string
- whenever this is bumped, all older states will break; but this makes
bumping state versions way easier
- also, the version string makes it a lot easier to identify
compatibility windows for save states
- SNES PPU now uses uint16 vram[32768] for memory accesses [hex_usr]
NOTE: Super Game Boy loading is currently broken, and I'm not entirely
sure how to fix it :/
The file loading handoff was -really- complicated, and so I'm kind of
at a loss ... so for now, don't try it.
Everything else should theoretically work, so please report any bugs
you find.
So, this is pretty much it. I'd be very curious to hear feedback from
people who objected to the old nall/stream design, whether they are
happy with the new file loading system or think it could use further
improvements.
The 16-bit VRAM turned out to be a wash on performance (roughly the same
as before. 1fps slower on Zelda 3, 1fps faster on Yoshi's Island.) The
main reason for this was because Yoshi's Island was breaking horribly
until I changed the vramRead, vramWrite functions to take uint15 instead
of uint16.
I suspect the issue is we're using uint16s in some areas now that need
to be uint15, and this game is setting the VRAM address to 0x8000+,
causing us to go out of bounds on memory accesses.
But ... I want to go ahead and do something cute for fun, and just because
we can ... and this new interface is so incredibly perfect for it!! I
want to support an SNES unit with 128KiB of VRAM. Not out of the box,
but as a fun little tweakable thing. The SNES was clearly designed to
support that, they just didn't use big enough VRAM chips, and left one
of the lines disconnected. So ... let's connect it anyway!
In the end, if we design it right, the only code difference should be
one area where we mask by 15-bits instead of by 16-bits.
2016-06-24 12:09:30 +00:00
|
|
|
data[0].bits( 0,15) = ppu.vram[offset + 0];
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
}
|
|
|
|
|
2016-06-14 10:51:54 +00:00
|
|
|
if(mirrorX) for(auto n : range(2)) {
|
2017-09-05 00:56:52 +00:00
|
|
|
data[n] = (data[n] >> 4 & 0x0f0f0f0f) | (data[n] << 4 & 0xf0f0f0f0);
|
|
|
|
data[n] = (data[n] >> 2 & 0x33333333) | (data[n] << 2 & 0xcccccccc);
|
|
|
|
data[n] = (data[n] >> 1 & 0x55555555) | (data[n] << 1 & 0xaaaaaaaa);
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
}
|
|
|
|
}
|
2010-08-09 13:28:56 +00:00
|
|
|
|
Update to v095r05 release.
byuu says:
Changelog:
- GBA: lots of emulation improvements
- PPU PRAM is 16-bits wide
- DMA masks &~1/Half, &~3/Word
- VRAM OBJ 8-bit writes are ignored
- OAM 8-bit writes are ignored
- BGnCNT unused bits are writable*
- BG(0,1)CNT can't set the d13
- BLDALPHA is readable (fixes Donkey Kong Country, etc)
- SNES: lots of code cleanups
- sfc/chip => sfc/coprocessor
- UI: save most recent controller selection
GBA test scores: 1552/1552, 37/38, 1020/1260
(* forgot to add the value to the read function, so endrift's I/O tests
for them will fail. Fixed locally.)
Note: SNES is the only system with multiple controller/expansion port
options, and as such is the only one with a "None" option. Because it's
shared by the controller and expansion port, it ends up sorted first in
the list. This means that on your first run, you'll need to go to Super
Famicom->Controller Port 1 and select "Gamepad", otherwise input won't
work.
Also note that changing the expansion port device requires loading a new
cart. Unlike controllers, you aren't meant to hotplug expansion port
devices.
2015-11-12 10:15:03 +00:00
|
|
|
auto PPU::Background::run(bool screen) -> void {
|
2016-06-14 10:51:54 +00:00
|
|
|
if(ppu.vcounter() == 0) return;
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
|
2016-06-14 10:51:54 +00:00
|
|
|
if(screen == Screen::Below) {
|
|
|
|
output.above.priority = 0;
|
|
|
|
output.below.priority = 0;
|
2017-09-05 00:56:52 +00:00
|
|
|
if(!hires()) return;
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
}
|
|
|
|
|
2016-06-14 10:51:54 +00:00
|
|
|
if(tileCounter-- == 0) {
|
|
|
|
tileCounter = 7;
|
|
|
|
getTile();
|
Update to v068r14 release.
byuu says:
Holy hell, that was a total brain twister. After hours of crazy bit
twiddling and debug printf's, I finally figured out how to allow both
lores and hires scrolling in the accurate PPU renderer. In the process,
I modified the main loop to run from -7 to 255, regardless of the hires
setting, and perform X adjustment inside the tile fetching. This fixed
a strange main/subscreen misalignment issue, so I was able to restore
the proper sub-then-main rendering for the final screen output stage.
Code looks a good bit cleaner this way overall.
I also added load state and save state menus to the tools menu, so you
can use the menubar to load and save to ten slots. I am thinking that
I should nuke the icons. As pretty as they are, it's getting tiresome
trying to find icons for everything, I have no pictures to represent
loading or saving a slot, nor to represent individual slots. I'll just
stick to radios and checkboxes.
2010-09-05 23:08:03 +00:00
|
|
|
}
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
|
Update to v099r14 release.
byuu says:
Changelog:
- (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel
like they were contributing enough to be worth it]
- cleaned up nall::integer,natural,real functionality
- toInteger, toNatural, toReal for parsing strings to numbers
- fromInteger, fromNatural, fromReal for creating strings from numbers
- (string,Markup::Node,SQL-based-classes)::(integer,natural,real)
left unchanged
- template<typename T> numeral(T value, long padding, char padchar)
-> string for print() formatting
- deduces integer,natural,real based on T ... cast the value if you
want to override
- there still exists binary,octal,hex,pointer for explicit print()
formatting
- lstring -> string_vector [but using lstring = string_vector; is
declared]
- would be nice to remove the using lstring eventually ... but that'd
probably require 10,000 lines of changes >_>
- format -> string_format [no using here; format was too ambiguous]
- using integer = Integer<sizeof(int)*8>; and using natural =
Natural<sizeof(uint)*8>; declared
- for consistency with boolean. These three are meant for creating
zero-initialized values implicitly (various uses)
- R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees
up struct IO {} io; naming]
- SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {}
(status,registers); now
- still some CPU::Status status values ... they didn't really fit into
IO functionality ... will have to think about this more
- SFC CPU, PPU, SMP now use step() exclusively instead of addClocks()
calling into step()
- SFC CPU joypad1_bits, joypad2_bits were unused; killed them
- SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it
- SFC PPU OAM moved into PPU::Object; since nothing else uses it
- the raw uint8[544] array is gone. OAM::read() constructs values from
the OAM::Object[512] table now
- this avoids having to determine how we want to sub-divide the two
OAM memory sections
- this also eliminates the OAM::synchronize() functionality
- probably more I'm forgetting
The FPS fluctuations are driving me insane. This WIP went from 128fps to
137fps. Settled on 133.5fps for the final build. But nothing I changed
should have affected performance at all. This level of fluctuation makes
it damn near impossible to know whether I'm speeding things up or slowing
things down with changes.
2016-07-01 11:50:32 +00:00
|
|
|
if(io.mode == Mode::Mode7) return runMode7();
|
Update to v094r39 release.
byuu says:
Changelog:
- SNES mid-scanline BGMODE fixes finally merged (can run
atx2.zip{mode7.smc}+mtest(2).sfc properly now)
- Makefile now discards all built-in rules and variables
- switch on bool warning disabled for GCC now as well (was already
disabled for Clang)
- when loading a game, if any required files are missing, display
a warning message box (manifest.bml, program.rom, bios.rom, etc)
- when loading a game (or a game slot), if manifest.bml is missing, it
will invoke icarus to try and generate it
- if that fails (icarus is missing or the folder is bad), you will get
a warning telling you that the manifest can't be loaded
The warning prompt on missing files work for both games and the .sys
folders and their files. For some reason, failing to load the DMG/CGB
BIOS is causing a crash before I can display the modal dialog. I have no
idea why, and the stack frame backtrace is junk.
I also can't seem to abort the failed loading process. If I call
Program::unloadMedia(), I get a nasty segfault. Again with a really
nasty stack trace. So for now, it'll just end up sitting there emulating
an empty ROM (solid black screen.) In time, I'd like to fix that too.
Lastly, I need a better method than popen for Windows. popen is kind of
ugly and flashes a console window for a brief second even if the
application launched is linked with -mwindows. Not sure if there even is
one (I need to read the stdout result, so CreateProcess may not work
unless I do something nasty like "> %tmp%/temp") I'm also using the
regular popen instead of _wpopen, so for this WIP, it won't work if your
game folder has non-English letters in the path.
2015-08-04 09:00:55 +00:00
|
|
|
|
2017-09-05 00:56:52 +00:00
|
|
|
uint8 color = getTileColor();
|
|
|
|
Pixel pixel;
|
|
|
|
pixel.priority = priority;
|
|
|
|
pixel.palette = color ? paletteIndex + color : 0;
|
|
|
|
pixel.tile = tile;
|
|
|
|
|
|
|
|
if(x == 0) {
|
2017-09-06 02:38:00 +00:00
|
|
|
mosaic.hcounter = mosaic.size + 1;
|
2017-09-05 00:56:52 +00:00
|
|
|
mosaic.pixel = pixel;
|
2017-09-06 02:38:00 +00:00
|
|
|
} else if((!hires() || screen == Screen::Below) && --mosaic.hcounter == 0) {
|
2017-09-05 00:56:52 +00:00
|
|
|
mosaic.hcounter = mosaic.size + 1;
|
|
|
|
mosaic.pixel = pixel;
|
|
|
|
} else if(mosaic.enable) {
|
|
|
|
pixel = mosaic.pixel;
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
}
|
2016-06-14 10:51:54 +00:00
|
|
|
if(screen == Screen::Above) x++;
|
2017-09-05 00:56:52 +00:00
|
|
|
if(pixel.palette == 0) return;
|
2010-08-09 13:28:56 +00:00
|
|
|
|
2017-09-05 00:56:52 +00:00
|
|
|
if(!hires() || screen == Screen::Above) if(io.aboveEnable) output.above = pixel;
|
|
|
|
if(!hires() || screen == Screen::Below) if(io.belowEnable) output.below = pixel;
|
2010-08-09 13:28:56 +00:00
|
|
|
}
|
|
|
|
|
2016-06-14 10:51:54 +00:00
|
|
|
auto PPU::Background::getTileColor() -> uint {
|
|
|
|
uint color = 0;
|
2012-02-18 07:49:52 +00:00
|
|
|
|
Update to v099r14 release.
byuu says:
Changelog:
- (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel
like they were contributing enough to be worth it]
- cleaned up nall::integer,natural,real functionality
- toInteger, toNatural, toReal for parsing strings to numbers
- fromInteger, fromNatural, fromReal for creating strings from numbers
- (string,Markup::Node,SQL-based-classes)::(integer,natural,real)
left unchanged
- template<typename T> numeral(T value, long padding, char padchar)
-> string for print() formatting
- deduces integer,natural,real based on T ... cast the value if you
want to override
- there still exists binary,octal,hex,pointer for explicit print()
formatting
- lstring -> string_vector [but using lstring = string_vector; is
declared]
- would be nice to remove the using lstring eventually ... but that'd
probably require 10,000 lines of changes >_>
- format -> string_format [no using here; format was too ambiguous]
- using integer = Integer<sizeof(int)*8>; and using natural =
Natural<sizeof(uint)*8>; declared
- for consistency with boolean. These three are meant for creating
zero-initialized values implicitly (various uses)
- R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees
up struct IO {} io; naming]
- SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {}
(status,registers); now
- still some CPU::Status status values ... they didn't really fit into
IO functionality ... will have to think about this more
- SFC CPU, PPU, SMP now use step() exclusively instead of addClocks()
calling into step()
- SFC CPU joypad1_bits, joypad2_bits were unused; killed them
- SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it
- SFC PPU OAM moved into PPU::Object; since nothing else uses it
- the raw uint8[544] array is gone. OAM::read() constructs values from
the OAM::Object[512] table now
- this avoids having to determine how we want to sub-divide the two
OAM memory sections
- this also eliminates the OAM::synchronize() functionality
- probably more I'm forgetting
The FPS fluctuations are driving me insane. This WIP went from 128fps to
137fps. Settled on 133.5fps for the final build. But nothing I changed
should have affected performance at all. This level of fluctuation makes
it damn near impossible to know whether I'm speeding things up or slowing
things down with changes.
2016-07-01 11:50:32 +00:00
|
|
|
switch(io.mode) {
|
2012-02-18 07:49:52 +00:00
|
|
|
case Mode::BPP8:
|
2016-06-15 11:32:17 +00:00
|
|
|
color += data[1] >> 24 & 0x80;
|
|
|
|
color += data[1] >> 17 & 0x40;
|
|
|
|
color += data[1] >> 10 & 0x20;
|
|
|
|
color += data[1] >> 3 & 0x10;
|
2016-06-14 10:51:54 +00:00
|
|
|
data[1] <<= 1;
|
2012-02-18 07:49:52 +00:00
|
|
|
case Mode::BPP4:
|
2016-06-14 10:51:54 +00:00
|
|
|
color += data[0] >> 28 & 0x08;
|
|
|
|
color += data[0] >> 21 & 0x04;
|
2012-02-18 07:49:52 +00:00
|
|
|
case Mode::BPP2:
|
2016-06-14 10:51:54 +00:00
|
|
|
color += data[0] >> 14 & 0x02;
|
|
|
|
color += data[0] >> 7 & 0x01;
|
|
|
|
data[0] <<= 1;
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
}
|
2012-02-18 07:49:52 +00:00
|
|
|
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
return color;
|
|
|
|
}
|
|
|
|
|
Update to v102r02 release.
byuu says:
Changelog:
- I caved on the `samples[] = {0.0}` thing, but I'm very unhappy about it
- if it's really invalid C++, then GCC needs to stop accepting it
in strict `-std=c++14` mode
- Emulator::Interface::Information::resettable is gone
- Emulator::Interface::reset() is gone
- FC, SFC, MD cores updated to remove soft reset behavior
- split GameBoy::Interface into GameBoyInterface,
GameBoyColorInterface
- split WonderSwan::Interface into WonderSwanInterface,
WonderSwanColorInterface
- PCE: fixed off-by-one scanline error [hex_usr]
- PCE: temporary hack to prevent crashing when VDS is set to < 2
- hiro: Cocoa: removed (u)int(#) constants; converted (u)int(#)
types to (u)int_(#)t types
- icarus: replaced usage of unique with strip instead (so we don't
mess up frameworks on macOS)
- libco: added macOS-specific section marker [Ryphecha]
So ... the major news this time is the removal of the soft reset
behavior. This is a major!! change that results in a 100KiB diff file,
and it's very prone to accidental mistakes!! If anyone is up for
testing, or even better -- looking over the code changes between v102r01
and v102r02 and looking for any issues, please do so. Ideally we'll want
to test every NES mapper type and every SNES coprocessor type by loading
said games and power cycling to make sure the games are all cleanly
resetting. It's too big of a change for me to cover there not being any
issues on my own, but this is truly critical code, so yeah ... please
help if you can.
We technically lose a bit of hardware documentation here. The soft reset
events do all kinds of interesting things in all kinds of different
chips -- or at least they do on the SNES. This is obviously not ideal.
But in the process of removing these portions of code, I found a few
mistakes I had made previously. It simplifies resetting the system state
a lot when not trying to have all the power() functions call the reset()
functions to share partial functionality.
In the future, the goal will be to come up with a way to add back in the
soft reset behavior via keyboard binding as with the Master System core.
What's going to have to happen is that the key binding will have to send
a "reset pulse" to every emulated chip, and those chips are going to
have to act independently to power() instead of reusing functionality.
We'll get there eventually, but there's many things of vastly greater
importance to work on right now, so it'll be a while. The information
isn't lost ... we'll just have to pull it out of v102 when we are ready.
Note that I left the SNES reset vector simulation code in, even though
it's not possible to trigger, for the time being.
Also ... the Super Game Boy core is still disconnected. To be honest, it
totally slipped my mind when I released v102 that it wasn't connected
again yet. This one's going to be pretty tricky to be honest. I'm
thinking about making a third GameBoy::Interface class just for SGB, and
coming up with some way of bypassing platform-> calls when in this
mode.
2017-01-22 21:04:26 +00:00
|
|
|
auto PPU::Background::power() -> void {
|
2017-09-05 00:56:52 +00:00
|
|
|
io = {};
|
Update to v104r04 release.
byuu says:
Changelog:
- higan/emulator: added new Random class with three entropy settings:
none, low, and high
- md/vdp: corrected Vcounter readout in interlace mode [MoD]
- sfc: updated core to use the new Random class; defaults to high
entropy
No entropy essentially returns 0, unless the random.bias(n) function is
called, in which case, it returns n. In this case, n is meant to be the
"logical/ideal" default value that maximizes compatibility with games.
Low entropy is a very simple entropy modeled after RAM initialization
striping patterns (eg 32 0x00s, followed by 32 0xFFs, repeating
throughout.) It doesn't "glitch" like real hardware does on rare
occasions (parts of the pattern being broken from time to time.) It also
only really returns 0 or ~0. So the entropy is indeed extremely low, and
not very useful at all for detecting bugs. Over time, we can try to
improve this, of course.
High entropy is PCG. This replaces the older, lower-entropy and more
predictable, LFSR. PCG should be more than enough for emulator
randomness, while still being quite fast.
Unfortunately, the bad news ... both no entropy and low entropy fix the
Konami logo popping sound in Prince of Persia, but all three entropy
settings still cause the distortion in-game, especially evident at the
title screen. So ... this may be a more serious bug than first
suspected.
2017-08-24 02:45:24 +00:00
|
|
|
io.tiledataAddress = (random() & 0x0f) << 12;
|
|
|
|
io.screenAddress = (random() & 0xfc) << 8;
|
|
|
|
io.screenSize = random();
|
|
|
|
io.tileSize = random();
|
|
|
|
io.aboveEnable = random();
|
|
|
|
io.belowEnable = random();
|
|
|
|
io.hoffset = random();
|
|
|
|
io.voffset = random();
|
2016-06-14 10:51:54 +00:00
|
|
|
|
2017-09-05 00:56:52 +00:00
|
|
|
latch = {};
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
|
2017-09-05 00:56:52 +00:00
|
|
|
output.above = {};
|
|
|
|
output.below = {};
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
|
2017-09-05 00:56:52 +00:00
|
|
|
mosaic = {};
|
|
|
|
mosaic.size = random();
|
|
|
|
mosaic.enable = random();
|
2010-09-09 06:34:54 +00:00
|
|
|
|
2011-06-05 03:25:24 +00:00
|
|
|
x = 0;
|
|
|
|
y = 0;
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
|
2016-06-14 10:51:54 +00:00
|
|
|
tileCounter = 0;
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
tile = 0;
|
|
|
|
priority = 0;
|
2016-06-14 10:51:54 +00:00
|
|
|
paletteNumber = 0;
|
|
|
|
paletteIndex = 0;
|
2017-09-05 00:56:52 +00:00
|
|
|
for(auto& word : data) word = 0;
|
Update to v068r12 release.
(there was no r11 release posted to the WIP thread)
byuu says:
This took ten hours of mind boggling insanity to pull off.
It upgrades the S-PPU dot-based renderer to fetch one tile, and then
output all of its pixels before fetching again. It sounds easy enough,
but it's insanely difficult. I ended up taking one small shortcut, in
that rather than fetch at -7, I fetch at the first instance where a tile
is needed to plot to x=0. So if you have {-3 to +4 } as a tile, it
fetches at -3. That won't work so well on hardware, if two BGs fetch at
the same X offset, they won't have time.
I have had no luck staggering the reads at BG1=-7, BG3=-5, etc. While
I can shift and fetch just fine, what happens is that when a new tile is
fetched in, that gives a new palette, priority, etc; and this ends up
happening between two tiles which results in the right-most edges of the
screen ending up with the wrong colors and such.
Offset-per-tile is cheap as always. Although looking at it, I'm not sure
how BG3 could pre-fetch, especially with the way one or two OPT modes
can fetch two tiles.
There's no magic in Hoffset caching yet, so the SMW1 pixel issue is
still there.
Mode 7 got a bugfix, it was off-by-one horizontally from the mosaic
code. After re-designing the BG mosaic, I ended up needing a separate
mosaic for Mode7, and in the process I fixed that bug. The obvious
change is that the Chrono Trigger Mode7->Mode2 transition doesn't cause
the pendulum to jump anymore.
Windows were simplified just a tad. The range testing is shared for all
modes now. Ironically, it's a bit slower, but I'll take less code over
more speed for the accuracy core.
Speaking of speed, because there's so much less calculations per pixel
for BGs, performance for the entire emulator has gone up by 30% in the
accuracy core. Pretty neat overall, I can maintain 60fps in all but,
yeah you can guess can't you?
2010-09-04 03:36:03 +00:00
|
|
|
}
|
|
|
|
|
Update to v106r33 release.
byuu says:
Changelog:
- nall/GNUmakefile: added `openmp=(true,false)` option; can be toggled
when building higan/bsnes
- defaults to disabled on macOS, because Xcode doesn't stupidly
doesn't ship with support for it
- higan/GNUmakefile: forgot to switch target,profile back from
bsnes,fast to higan,accurate
- this is just gonna happen from time to time, sorry
- sfc/dsp: when using the fast profile, the DSP syncs per sample
instead of per clock
- should only negatively impact Koushien 2, but is a fairly
significant speedup otherwise
- sfc/ppc,ppu-fast: optimized the code a bit (ppu 130fps to 133fps)
- sfc/ppu-fast: basic vertical mosaic support (not accurate, but
should look okay hopefully)
- sfc/ppu-fast: added missing mode7 hflip support
- sfc/ppu-fast: added support to render at 256-width and/or 240-height
- gives a decent speed boost, and also allows all of the older
quark shaders to work nicely again
- it does violate the contract of Emulator::Interface, but oh
well, it works fine in the bsnes GUI
- sfc/ppu-fast: use cached CGRAM values for mode7 and sprites
- sfc/ppu-fast: use global range/time over flags in object rendering
- may not actually work as we intended since it's a race condition
even if it's only ORing the flags
- really don't want to have to make those variables atomic if I
don't have to
- sfc/ppu-fast: should fully support interlace and overscan modes now
- hiro/cocoa: updated macOS Gatekeeper disable support to work on
10.13+
- ruby: forgot to fix macOS input driver, sorry
- nall/GNUmakefile: if uname is present, then just default to rm
instead of del (fixes Msys)
Note: blur emulation option will break pretty badly in 256x240 output
mode. I'll fix it later.
2018-05-31 07:06:55 +00:00
|
|
|
auto PPU::Background::getTile(uint x, uint y) -> uint16 {
|
|
|
|
uint tileHeight = 3 + io.tileSize;
|
2017-09-05 00:56:52 +00:00
|
|
|
uint tileWidth = !hires() ? tileHeight : 4;
|
Update to v106r33 release.
byuu says:
Changelog:
- nall/GNUmakefile: added `openmp=(true,false)` option; can be toggled
when building higan/bsnes
- defaults to disabled on macOS, because Xcode doesn't stupidly
doesn't ship with support for it
- higan/GNUmakefile: forgot to switch target,profile back from
bsnes,fast to higan,accurate
- this is just gonna happen from time to time, sorry
- sfc/dsp: when using the fast profile, the DSP syncs per sample
instead of per clock
- should only negatively impact Koushien 2, but is a fairly
significant speedup otherwise
- sfc/ppc,ppu-fast: optimized the code a bit (ppu 130fps to 133fps)
- sfc/ppu-fast: basic vertical mosaic support (not accurate, but
should look okay hopefully)
- sfc/ppu-fast: added missing mode7 hflip support
- sfc/ppu-fast: added support to render at 256-width and/or 240-height
- gives a decent speed boost, and also allows all of the older
quark shaders to work nicely again
- it does violate the contract of Emulator::Interface, but oh
well, it works fine in the bsnes GUI
- sfc/ppu-fast: use cached CGRAM values for mode7 and sprites
- sfc/ppu-fast: use global range/time over flags in object rendering
- may not actually work as we intended since it's a race condition
even if it's only ORing the flags
- really don't want to have to make those variables atomic if I
don't have to
- sfc/ppu-fast: should fully support interlace and overscan modes now
- hiro/cocoa: updated macOS Gatekeeper disable support to work on
10.13+
- ruby: forgot to fix macOS input driver, sorry
- nall/GNUmakefile: if uname is present, then just default to rm
instead of del (fixes Msys)
Note: blur emulation option will break pretty badly in 256x240 output
mode. I'll fix it later.
2018-05-31 07:06:55 +00:00
|
|
|
uint screenX = io.screenSize.bit(0) ? 32 << 5 : 0;
|
|
|
|
uint screenY = io.screenSize.bit(1) ? 32 << 5 + io.screenSize.bit(0) : 0;
|
|
|
|
uint tileX = x >> tileWidth;
|
|
|
|
uint tileY = y >> tileHeight;
|
|
|
|
uint16 offset = (tileY & 0x1f) << 5 | (tileX & 0x1f);
|
|
|
|
if(tileX & 0x20) offset += screenX;
|
|
|
|
if(tileY & 0x20) offset += screenY;
|
Update to v099r14 release.
byuu says:
Changelog:
- (u)int(max,ptr) abbreviations removed; use _t suffix now [didn't feel
like they were contributing enough to be worth it]
- cleaned up nall::integer,natural,real functionality
- toInteger, toNatural, toReal for parsing strings to numbers
- fromInteger, fromNatural, fromReal for creating strings from numbers
- (string,Markup::Node,SQL-based-classes)::(integer,natural,real)
left unchanged
- template<typename T> numeral(T value, long padding, char padchar)
-> string for print() formatting
- deduces integer,natural,real based on T ... cast the value if you
want to override
- there still exists binary,octal,hex,pointer for explicit print()
formatting
- lstring -> string_vector [but using lstring = string_vector; is
declared]
- would be nice to remove the using lstring eventually ... but that'd
probably require 10,000 lines of changes >_>
- format -> string_format [no using here; format was too ambiguous]
- using integer = Integer<sizeof(int)*8>; and using natural =
Natural<sizeof(uint)*8>; declared
- for consistency with boolean. These three are meant for creating
zero-initialized values implicitly (various uses)
- R65816::io() -> idle() and SPC700::io() -> idle() [more clear; frees
up struct IO {} io; naming]
- SFC CPU, PPU, SMP use struct IO {} io; over struct (Status,Registers) {}
(status,registers); now
- still some CPU::Status status values ... they didn't really fit into
IO functionality ... will have to think about this more
- SFC CPU, PPU, SMP now use step() exclusively instead of addClocks()
calling into step()
- SFC CPU joypad1_bits, joypad2_bits were unused; killed them
- SFC PPU CGRAM moved into PPU::Screen; since nothing else uses it
- SFC PPU OAM moved into PPU::Object; since nothing else uses it
- the raw uint8[544] array is gone. OAM::read() constructs values from
the OAM::Object[512] table now
- this avoids having to determine how we want to sub-divide the two
OAM memory sections
- this also eliminates the OAM::synchronize() functionality
- probably more I'm forgetting
The FPS fluctuations are driving me insane. This WIP went from 128fps to
137fps. Settled on 133.5fps for the final build. But nothing I changed
should have affected performance at all. This level of fluctuation makes
it damn near impossible to know whether I'm speeding things up or slowing
things down with changes.
2016-07-01 11:50:32 +00:00
|
|
|
uint16 address = io.screenAddress + offset;
|
Update to v099r07 release.
byuu says:
Changelog:
- (hopefully) fixed BS Memory and Sufami Turbo slot loading
- ported GB, GBA, WS cores to use nall/vfs
- completely removed loadRequest, saveRequest functionality from
Emulator::Interface and ui-tomoko
- loadRequest(folder) is now load(folder)
- save states now use a shared Emulator::SerializerVersion string
- whenever this is bumped, all older states will break; but this makes
bumping state versions way easier
- also, the version string makes it a lot easier to identify
compatibility windows for save states
- SNES PPU now uses uint16 vram[32768] for memory accesses [hex_usr]
NOTE: Super Game Boy loading is currently broken, and I'm not entirely
sure how to fix it :/
The file loading handoff was -really- complicated, and so I'm kind of
at a loss ... so for now, don't try it.
Everything else should theoretically work, so please report any bugs
you find.
So, this is pretty much it. I'd be very curious to hear feedback from
people who objected to the old nall/stream design, whether they are
happy with the new file loading system or think it could use further
improvements.
The 16-bit VRAM turned out to be a wash on performance (roughly the same
as before. 1fps slower on Zelda 3, 1fps faster on Yoshi's Island.) The
main reason for this was because Yoshi's Island was breaking horribly
until I changed the vramRead, vramWrite functions to take uint15 instead
of uint16.
I suspect the issue is we're using uint16s in some areas now that need
to be uint15, and this game is setting the VRAM address to 0x8000+,
causing us to go out of bounds on memory accesses.
But ... I want to go ahead and do something cute for fun, and just because
we can ... and this new interface is so incredibly perfect for it!! I
want to support an SNES unit with 128KiB of VRAM. Not out of the box,
but as a fun little tweakable thing. The SNES was clearly designed to
support that, they just didn't use big enough VRAM chips, and left one
of the lines disconnected. So ... let's connect it anyway!
In the end, if we design it right, the only code difference should be
one area where we mask by 15-bits instead of by 16-bits.
2016-06-24 12:09:30 +00:00
|
|
|
return ppu.vram[address];
|
2010-08-09 13:28:56 +00:00
|
|
|
}
|