GSOffset is already based on a lookup of PSM/BP/BW. Coverage only adds
the size parameters (so only 256 possibilities).
It replaces the hash lookup with a free array access.
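A minimal sketch of the idea, not the actual GSOffset code. It assumes the size is described by two 4-bit log2 fields (hence 16 * 16 = 256 slots). `OffsetSketch`, `GetCoverage` and the placeholder page computation are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for an object that is already unique per PSM/BP/BW.
class OffsetSketch
{
public:
    // Assumption: tw/th are 4-bit log2 sizes, so the pair indexes 256 slots.
    const std::vector<uint32_t>& GetCoverage(uint32_t tw, uint32_t th)
    {
        const uint32_t index = ((th & 15u) << 4) | (tw & 15u);
        std::unique_ptr<std::vector<uint32_t>>& slot = m_coverage[index];
        if (!slot)
        {
            // Computed once, then every later call is a plain array access.
            slot.reset(new std::vector<uint32_t>(Compute(tw & 15u, th & 15u)));
            ++m_computed;
        }
        return *slot;
    }

    int ComputedCount() const { return m_computed; }

private:
    // Placeholder coverage: one entry per 64x32 tile of the texture.
    static std::vector<uint32_t> Compute(uint32_t tw, uint32_t th)
    {
        const uint32_t pages_x = ((1u << tw) + 63u) / 64u;
        const uint32_t pages_y = ((1u << th) + 31u) / 32u;
        std::vector<uint32_t> pages;
        for (uint32_t y = 0; y < pages_y; y++)
            for (uint32_t x = 0; x < pages_x; x++)
                pages.push_back(y * pages_x + x);
        return pages;
    }

    std::array<std::unique_ptr<std::vector<uint32_t>>, 256> m_coverage;
    int m_computed = 0;
};
```

The array replaces a hash keyed on the full parameter set because the PSM/BP/BW part of the key is already resolved by the owning object.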
The hypothesis is that the game will use a depth (aka Z32/Z24/Z16/Z16S)
format when sampling depth texture as color. Technically one could use
a standard color format but block/pixel order won't be the same.
(otherwise I'm screwed)
=> Hypothesis invalid on GoW. They just do a scrambled rendering...
Lookup info:
* The first searched list is the depth pool as we search a depth
* 2nd one is the render target pool (if a depth was converted to a
render target already)
To avoid any CPU overhead, the source will be a pointer to the real texture
* Conversion (if float texture) will be done on the fly by the shader (GPU).
* Relative rescaling won't be supported. Texture must be fetched with
integral coordinates
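The lookup order above could be sketched as follows. This is only an illustration: `TargetSketch`, `PoolKind` and `LookupDepthSource` are hypothetical names, and matching on the base pointer alone is an assumption. The result is a pointer to the existing target, so nothing is copied on the CPU.

```cpp
#include <cstdint>
#include <list>

enum class PoolKind { None, Depth, RenderTarget };

// Hypothetical target: only the base pointer matters for this sketch.
struct TargetSketch { uint32_t base_pointer; };

struct LookupResult
{
    PoolKind pool;
    const TargetSketch* target; // points at the real target, no copy
};

inline const TargetSketch* FindByBase(const std::list<TargetSketch>& pool, uint32_t bp)
{
    for (const TargetSketch& t : pool)
        if (t.base_pointer == bp)
            return &t;
    return nullptr;
}

// Depth pool first (we are looking for a depth), then the render target
// pool in case the depth was already converted to a render target.
inline LookupResult LookupDepthSource(const std::list<TargetSketch>& depth_pool,
                                      const std::list<TargetSketch>& rt_pool,
                                      uint32_t bp)
{
    if (const TargetSketch* t = FindByBase(depth_pool, bp))
        return {PoolKind::Depth, t};
    if (const TargetSketch* t = FindByBase(rt_pool, bp))
        return {PoolKind::RenderTarget, t};
    return {PoolKind::None, nullptr};
}
```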
Cache page coverage of texture into a hash map
Test done on Champion of Norrath (paltex + DisablePartialInvalidation)
Self of GSTextureCache::SourceMap::Add 5.39% => 0.23%
Self of GSTextureCache::LookupSource 15.27% => 10.82%
Hard to measure on CoN as it depends on memory transfer. Seems to be 5-10 fps faster.
Someone ought to add the Windows option too (and DisablePartialInvalidation too)
It might break a couple of games but most of them run better with depth enabled.
* Silent Hill 2 doesn't need the CRC hack
* GSRenderer: no need to explicitly set bottom value for r.
* Texture Cache: Removed a check which couldn't possibly enter true
Try to avoid random black screen frame
v2: don't force the preload hack on the frame
It creates a ghost image over FMV
v3: support offset within a frame
It often happens that the game tries to upload the FMV directly, which typically
gave a black screen.
Commit fixes Rule of Rose and, I hope, various black screen FMVs.
Performance impact must be tested, and I'm afraid of strange texture cache behavior.
V2: check the size of the transfer too
V3: add support of 16 bits format
V4: avoid division by 0
It actually removes the previous hack that read the full target.
Unfortunately Snowblind engine games use a big target, so the read is very big too (1280x448),
which kills performance, whereas the game requires only 24x12 texels.
Gives a 2x speed boost on Champion of Norrath !!!
Games use very special textures with a lot of repetition.
It is much faster to send the full texture rather than trying to partially invalidate it.
On my gs dump:
FPS: 29 => 68 !
Avoid a crash on Onimusha3 (PAL 60HZ)
In theory it would be better to find the root cause of the overflow, i.e. somewhere in the
code below. The dirty rectangle is too big.
if(rowsize > 0 && offset % rowsize == 0)
int y = GSLocalMemory::m_psm[psm].pgs.y * offset / rowsize;
if(r.bottom > y)
GL_CACHE("TC: Dirty After Target(%s) %d (0x%x)", to_string(type),
t->m_texture ? t->m_texture->GetID() : 0,
// TODO: do not add this rect above too
t->m_dirty.push_back(GSDirtyRect(GSVector4i(r.left, - y, r.right, r.bottom - y), psm));
t->m_TEX0.TBW = bw;
So as a temporary solution (that will likely stay for a couple of
years), buffers were increased.
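For reference, the offset computation quoted above can be sketched in isolation. This is a simplified illustration, not the real function: it assumes base pointers count 256-byte blocks, a page is 8192 bytes and `bw` is the buffer width in pages. `RectSketch` and `DirtyRectInTarget` are hypothetical names, and clamping the top to 0 is my own choice.

```cpp
#include <cstdint>

struct RectSketch { int left, top, right, bottom; };

// A write starts at write_bp; the target starts later in memory at target_bp.
// Returns true and fills 'out' with the written rectangle expressed in the
// target's coordinates when the target starts on a row boundary of the write.
inline bool DirtyRectInTarget(uint32_t write_bp, uint32_t target_bp, uint32_t bw,
                              int page_height, const RectSketch& r, RectSketch& out)
{
    if (target_bp < write_bp)
        return false;

    const uint32_t rowsize = bw * 8192u;                   // bytes per row of pages
    const uint32_t offset = (target_bp - write_bp) * 256u; // bytes between the bases

    // rowsize > 0 also protects the modulo/division below.
    if (rowsize == 0 || offset % rowsize != 0)
        return false;

    const int y = page_height * static_cast<int>(offset / rowsize);
    if (r.bottom <= y)
        return false; // the write ends before the target begins

    out.left = r.left;
    out.top = r.top > y ? r.top - y : 0; // assumption: clamp to the target start
    out.right = r.right;
    out.bottom = r.bottom - y;
    return true;
}
```

With `bw = 10` and a target 640 blocks after the write, a 640x448 write lands two page rows (64 lines) before the target, so the dirty area in the target is 640x384.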
Height of the dirty rectangle must be the GS size of the RT. Of course
RT doesn't have any height so we compute the max safest value.
Fix issue #987
Candidate for 1.4 release
1. Add GS_Renderer Enum
Replace all instances of int/uint32 renderer identifier by a strongly
typed enum and appropriate casts.
Only instances in GS[*].cpp/h classes were touched. GPU[*].cpp/h classes
do not follow the same convention.
2. Add default renderer according to OS
The default renderer is OS dependent (Win -> Dx9HW, others -> OGLHW).
Consequently one should always check against the appropriate default
value on config load.
The old behaviour was only - if at all - problematic if the respective
element in the gsdx.ini was missing and probably even then didn't create
issues. The current implementation is still more stable and does not
depend on the implementation of GS.cpp -> GetConfig()
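Points 1 and 2 could look roughly like this. It is a sketch under assumptions: only the `Dx9HW` and `OGLHW` names come from the message, while the numeric values, the `RendererSketch` type, the `"Renderer"` key and the map standing in for gsdx.ini are hypothetical.

```cpp
#include <map>
#include <string>

// Strongly typed renderer id. Values are placeholders, not the real ids.
enum class RendererSketch : int
{
    Undefined = -1,
    Dx9HW = 0,
    OGLHW = 1,
};

// Default depends on the OS: Windows -> Dx9HW, others -> OGLHW.
constexpr RendererSketch DefaultRenderer()
{
#ifdef _WIN32
    return RendererSketch::Dx9HW;
#else
    return RendererSketch::OGLHW;
#endif
}

// Hypothetical config load: a missing entry falls back to the OS default
// instead of relying on whatever the caller happens to pass in.
inline RendererSketch RendererFromConfig(const std::map<std::string, int>& ini)
{
    const std::map<std::string, int>::const_iterator it = ini.find("Renderer");
    if (it == ini.end())
        return DefaultRenderer();
    return static_cast<RendererSketch>(it->second); // explicit cast at the boundary
}

inline int ToConfigValue(RendererSketch r)
{
    return static_cast<int>(r);
}
```

The casts are confined to the config boundary, so the rest of the code only ever compares enum values.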
The goal is to check the impact on games that have wrong RT content.
It helps Smash Court Tennis Pro Tournament 2 a bit, but the game suffers from
another texture cache bug. (RT BW is 10 whereas texture BW is 8)
Note: Armored Core: Last Raven must be tested (only game so far
that relies on the option and I didn't want to add a new one).
Typical wrong draw:
1/ draw in 32 bits
2/ draw in 24 bits
3/ Use alpha as a texture. (Must reuse the GPU data)
4/ Write alpha from EE
5/ Use alpha as a texture. (Must upload new data)
This commit fixes step 5.
Fix #917 (Conflict - Desert Storm)
A couple of useless members were removed too.
Also fix wnd initialization
CID 146955 (#1 of 1): Uninitialized pointer read (UNINIT)
18. uninit_use: Using uninitialized value wnd[i].
gsdx changes:
Remove native resolution checkbox from GUI and rework associated code
Small changes to Windows and Linux GUI
Support 8x native resolution
Fix custom resolution width less than native width use case
* Greatly reduce the number of CLUT reads (factor 10x)
* Avoid getting the wrong TEXA texture in the cache.
* Fix "jump depends on uninitialized variable" Valgrind warning.
I try my best to avoid any breakage of DX but please test it too.
Add 2 new shaders:
* ps_main12: cast a 16 bit depth to a RGB5A1 color
* ps_main16: cast a RGB5A1 color to a 16 bit depth
Shader might be used in a future commit as it seems Silent Hill uses this
kind of format.
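A CPU-side sketch of the two casts, for illustration only (the real work is done in the pixel shaders). It assumes the usual 16 bit layout with R in bits 0-4, G in bits 5-9, B in bits 10-14 and A in bit 15. `Rgb5a1` and both function names are hypothetical.

```cpp
#include <cstdint>

// One RGB5A1 color, each channel stored unnormalized (0-31, alpha 0-1).
struct Rgb5a1
{
    uint8_t r, g, b, a;
};

// Reinterpret the 16 depth bits as a RGB5A1 color.
inline Rgb5a1 DepthToRgb5a1(uint16_t depth)
{
    Rgb5a1 c;
    c.r = static_cast<uint8_t>(depth & 31u);
    c.g = static_cast<uint8_t>((depth >> 5) & 31u);
    c.b = static_cast<uint8_t>((depth >> 10) & 31u);
    c.a = static_cast<uint8_t>((depth >> 15) & 1u);
    return c;
}

// Reinterpret a RGB5A1 color as a 16 bit depth value.
inline uint16_t Rgb5a1ToDepth(const Rgb5a1& c)
{
    return static_cast<uint16_t>((c.r & 31u) | ((c.g & 31u) << 5) |
                                 ((c.b & 31u) << 10) | ((c.a & 1u) << 15));
}
```

Both directions are pure bit reinterpretations, so a round trip returns the original value.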
Fix tab/indentation too
Partially invalidate RT when there is a write in the middle of it (actually 2 pages below)
Code is not yet enabled because
1/ I want to stabilize latest update
2/ not sure of the impact of the code
3/ maybe it needs a more generic version
Frame is always 32 bits but game can reuse it later as a 16 bits RT.
Fix half screen issue with Ricky Ponting Cricket
Unfortunately it triggers texture shuffle wrongly. I hope there is no
It might save a couple of fps
Add a define to test the perf if we keep only the blue channel. It breaks
the code in Prince Of Persia that uses the Red/Green channel... Maybe the
speed hack :( Or find a way to replace all if with a lookup table
Note: it is only supported on OpenGL currently
The code unscales the texture to ease the conversion. Quality is awful (same as before)
but I'm not sure we can support an upscaled texture
Maybe the quality loss is due to the reduction without mipmap
Maybe the best solution would be to add a hack to extract the blue channel
(with texture swizzle), and use a "full page/screen" sprite instead.
(it would be faster too)
Note: won't be compatible with MSAA (but gl doesn't support it anyway)
// In theory new textures contain invalidated data. Still in theory a new target
// must contain the content of the GS memory.
// In practice, TC will wrongly invalidate some RT. For example due to a write on the alpha
// channel while the colors are still valid. Unfortunately TC doesn't support the upload of data
// in target.
// Cleaning the code here will likely break several games. However it might reduce
// the noise in draw call debugging. It is the main reason to enable it on debug build.
// From a performance point of view, it might cost a little on big upscaling
// but normally few RTs are missed so it must remain reasonable.
The game can directly upload a background or the full image in the
"CTRC" buffer. Previous code gave a full black screen.
It will also avoid various black screen issues in gs dumps.
hidden option: preload_frame_with_gs_data
Note: impact on upscaling was not tested and it's likely broken
Improve the rendering in MGS3 (even if the game is still broken
due to other TC issues)
// Typical bug (MGS3 blue cloud):
// 1/ RT used as 32 bits => alpha channel written
// 2/ RT used as 24 bits => no update of alpha channel
// 3/ Lookup of texture that used alpha channel as index, HasSharedBits will return false
// because of the previous draw call format
// Solution: consider the RT as 32 bits if the alpha was used in the past
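The solution can be sketched like this. It is an illustration under assumptions: `TargetSketch`, `ColorFormat` and `CanSampleAlpha` are hypothetical names, and the real shared-bits check covers more formats than the two shown.

```cpp
enum class ColorFormat { C32, C24 };

// Hypothetical render target that remembers whether alpha was ever written.
struct TargetSketch
{
    ColorFormat last_format = ColorFormat::C24;
    bool alpha_was_written = false;

    void OnDraw(ColorFormat fmt)
    {
        last_format = fmt;
        if (fmt == ColorFormat::C32)
            alpha_was_written = true; // sticky: a later 24 bits draw won't clear it
    }

    // Treat the RT as 32 bits if the alpha channel was used in the past.
    ColorFormat EffectiveFormat() const
    {
        return alpha_was_written ? ColorFormat::C32 : last_format;
    }
};

// Stand-in for the shared-bits test of a texture that indexes with alpha.
inline bool CanSampleAlpha(const TargetSketch& t)
{
    return t.EffectiveFormat() == ColorFormat::C32;
}
```

After a 32 bits draw followed by a 24 bits draw, the lookup still succeeds because the decision no longer depends only on the last draw call format.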
It avoids various upscaling glitches on GS post-processing effects
// 1/ Palette is used to interpret the alpha channel of the RT as an index.
// Star Ocean 3 uses it to emulate a stencil buffer.
// 2/ Z formats are a bad idea to interpolate (discontinuities).
// 3/ 16 bits buffer is used to move data from a channel to another.
// I keep linear filtering for standard color even if I'm not sure that it is
// working correctly.
// Indeed, texture is reduced so you need to read all covered pixels (9 in 3x)
// to correctly interpolate the value. Linear interpolation is likely acceptable
// only in 2x scaling
// Src texture will still be bilinear interpolated so I'm really not sure
// that we need to do it here too.
// Future note: instead of doing
// RT 2048x2048 -> T 1024x1024 -> RT 2048x2048
// We can maybe sample directly a bigger texture
// RT 2048x2048 -> T 2048x2048 -> RT 2048x2048
// Pro: better quality. Copy instead of StretchRect (must be faster)
// Cons: consume more memory
// In distant future: investigate to reuse the RT directly without any
// copy. Likely a speed boost and memory usage reduction.
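The filter choice described in points 1/ to 3/ above could be condensed into a small predicate. This is only a sketch: `SourceInfoSketch` and its fields are hypothetical, and the real code derives these facts from the texture registers.

```cpp
// Hypothetical description of the texture about to be read from a RT.
struct SourceInfoSketch
{
    bool uses_palette;    // 1/ alpha channel interpreted as an index
    bool is_depth_format; // 2/ Z formats have discontinuities
    bool is_16_bits;      // 3/ used to move data between channels
};

// Linear filtering is kept only for standard color; the three special
// cases need exact values, so they get nearest sampling.
inline bool UseLinearRescale(const SourceInfoSketch& s)
{
    return !(s.uses_palette || s.is_depth_format || s.is_16_bits);
}
```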
It seems to impact lots of games that still have issues (VP2, MTG3, PoP)
The PSMT32 format is read as PSMT8. I think we need to convert it to PSMT8H (i.e.
unpack it to have only an alpha channel)
GS doesn't support texture shuffle/swizzle so it is emulated in a
complex way.
The idea is to read/write the 32 bits color format as a 16 bit format.
This way, RG (16 lsb bits) or BA (16 msb bits) can be read or written with
square texture that targets pixels 1-8 or pixels 8-16.
However shuffle is limited. For example you can copy the green channel
to either the alpha channel or another green channel.
Note: Partial masking of channel is not yet implemented
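To make the 32 bits as 16 bits trick concrete, here is a CPU-side sketch on a single pixel. It assumes R sits in bits 0-7, G in 8-15, B in 16-23 and A in 24-31. The function names are hypothetical, and the real emulation works on sprites in the shader, not on single values.

```cpp
#include <cstdint>

// The 16 lsb of a 32 bits pixel hold R (low byte) and G (high byte).
inline uint16_t LowHalf(uint32_t rgba)
{
    return static_cast<uint16_t>(rgba & 0xFFFFu);
}

// The 16 msb hold B (low byte) and A (high byte).
inline uint16_t HighHalf(uint32_t rgba)
{
    return static_cast<uint16_t>(rgba >> 16);
}

// Green is the high byte of the RG half, so when that half is written over
// a BA half it can only land in the high byte, i.e. in alpha.
inline uint32_t CopyGreenToAlpha(uint32_t rgba)
{
    const uint16_t rg = LowHalf(rgba);
    uint16_t ba = HighHalf(rgba);
    ba = static_cast<uint16_t>((ba & 0x00FFu) | (rg & 0xFF00u));
    return (static_cast<uint32_t>(ba) << 16) | rg;
}
```

This is why the shuffle is limited: a byte keeps its position inside the 16 bits half, so green can reach alpha or another green, but not red or blue.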
V2: improve logging
V3: better support of green channel in shader
V4: improve detection of destination (issue due to rounding)
When the RT is used as an input texture, we need to rescale it.
Previous behavior was to always use linear filtering (smoother).
Unfortunately it broke some games that expected an exact value, like Star Ocean 3.
This commit will disable the linear filtering in normal filtering mode (filter = 0
or filter = 2)
This way, shadow of Star Ocean 3 will appear correctly in upscaling (not
100% perfect but can't do better)
Note: SO3 only requires a nearest sampling of the alpha channel but
I don't know the behavior for other games.