* Support Mesa Nouveau IR (free driver for Nvidia's GPU)
=> Print intermediate representation + final shader
=> Dump GPR usage
* Move dumped shader in /tmp/GSdx_Shader/<sub_dir>
=> Avoid the landing of 3 thousands of files in $PWD ^^
* Use function instead of macro
Legacy GPU:
Older driver will be broken.
Still supported GPU:
Please upgrade to the latest AMD driver 16.5.2 or 16.5.3 (and prey that future driver will still work)
Game often uses date to allow a single pixel pass. If this
use case is detected, stencil buffer will be cleared after first pixels
that pass both depth&stencil test.
It seems to reduce the load on the GPU.
Note: with the help of texture barriere, maybe we could implement the algo
with a single pass.
If a color buffer is still attached and is smaller than depth buffer,
the latter won't be fully cleared.
As a faster alternative, use GL4.4 clear texture function. Avoid to fiddle with
framebuffer and pixel tests.
Fix #1362x Ar Tolenico 2 map clip
Until AMD release the driver with a fix, I can't use 2nd blending source with SSO.
So let's use the first source. Blending/Alpha will be wrong. But it is likely better
than an uninitialized alpha value.
Will use integral coordinate to avoid any rescaling.
Bilinear interpolation isn't supported. I don't think it is allowed to
filter a depth texture anyway.
* keep a reference of program/pipeline created to ease the deletion
* extend a bit the API to support multiple pipeline
Final goal will be to use a pre link pipeline for SW shaders. And uses
the default pipeline for HW shaders.
Texture coordinate could be dummy/float/int integral/int normalized.
Old behavior:
* VS was in charge to select the texture coordinate
* int integral format wasn't supported
New behavior:
* Always compute all formats
* FS will be in charge to select the good format
Impact:
* VS will be slightly slower but it reduces shaders permutation from
little to 0 (won't be bad for CPU)
* FS speed isn't impacted as 2 separate code paths were already required
to support both format
* Rasterizer will be 33% slower but unlikely to be the limited factor of
the GPU
* In future we could directly use the integral format in the FS.
V2: remove useless PSin_t
In the DATE42 algo, first pass must find the primitive that write the
bad alpha value. If depth test is fail, alpha value won't be written therefore
you mustn't keep the primitive id.
In theory to ensure 100% correctness, depth would need to be fully executed
(currently depth write is disabled). However it requires to copy the depth buffer.
It is likely bad for the perf.
Issue reported on DBZInfWorld
Function pointer was mangled to avoid any collision. Nowadays all symbols
are hidden so no risk of collision.
Syntax is nicer beside it would allow to put back GLES3.2. I think it
supports most of the used extension.
glActiveTexture & glBlendColor are provided without symbol query.
A couple of useless members were removed too.
Also fix wnd initialization
Coverity:
CID 146955 (#1 of 1): Uninitialized pointer read (UNINIT)
18. uninit_use: Using uninitialized value wnd[i].
gsdx changes:
Remove native resolution checkbox from GUI and rework associated code
Small changes to Windows and Linux GUI
Support 8x native resolution
Fix custom resolution width less than native width use case
upscale_multiplier function values have been changed to allocate native resolution and also move custom resolution to 9.
Remove the old native checkbox value and include Native in the combo box.
Internal GSDX functions have also been updated with this new update to the upscale_multiplier variable.
* add lengthly comment to explain the format
* Likely reduce the number of shader permutation
* Avoid slow AEM (on GPU)
Expect regressions because TC needs some fixes
v2: fix palette mode
Add 2 new shaders:
* ps_main12: cast a 16 bit depth to a RGB5A1 color
* ps_main16: cast a a RGB5A1 color to a 16 bit depth
Shader might be used in future commit as it seems Silent Hill uses this
kind of format.
Fix tab/indentation too
Removes the checkbox of Anisotropic filtering from the GSDX plugin settings, the checkbox was usually used to enable & disable the AF which is not necessary since there is an option in the drop down list for disabling AF.
the internal function value of "AnisotropicFiltering" has been replaced with "MaxAnisotropy" for detection.
the detection uses the function getconfig("MaxAnisotropy", value) where value 0 means disabled and value is the default value when no value is set in the INI file.
Intially GSBlendStateOGL was an alias of the m_blendMapD3D9 array
The object was replaced by an index in the array. Save 2k of memory duplication.
And too much useless code.
v2: push/pop blending state in DATE stuff
v3: remove m_state which is useless now
The idea is to use a floating texture to accumulate the data and
then do a final postprocessing pass to apply the modulo
v2:
* use bounding box to
* fix vertex corruption issue
* use negative number in shader which allow to use half float (+12
fps@4x)
Previous CopyRect function does a memcopy without conversion.
This function will allow to use different format for input/output. Just a
possibility for the future
Basically the code does the alpha multiplication in the shader therefore
the blend unit only does a pure addition. This way the multiplication is
accurate and accurate_blending doesn't requires a costly barrier.
This code also avoid variable duplication to make the code more separated.
Hopefully blending can be done in a separated function
It is preliminary work to support fast color clipping with HDR
v2: fix assertion compilation failure
v3: fix regression in not accurate mode
v3: Cs * As/Af is not an accumulation
Those cases don't need the Cd addition and were already optimized anyway
Fix a regression on GoW2
Accurate options do a better jobs. Technically it can still
be useful for old gpu/driver that doesn't support the GL4.5 extension.
On Windows, you can still rely on Dx
On linux, free driver support it (except Intel)
GS uses integer value and does integer operation too.
This commit trunc the sampled texture, the interpoled fragment color
and the product of the 2.
It impacts negatively the perf of about 3/4% (GPU) but it fixes rendering on
suikoden and potentially some others games too.
Nvidia allows to get the ASM of the shader of the compiled shader. It is useful
to check the performance.
It also allow me to compile most of shader code path for QA
Dump is enabled in linux replayer + debug_glsl_shader = 2