Commit Graph

204 Commits

Author SHA1 Message Date
Gregory Hainaut 16c2baa0df gsdx-ogl: don't hardcode the accumulation blend trick
Perf impact is bigger than expected
2016-07-07 19:56:23 +02:00
Gregory Hainaut d7c1faf563 gsdx ogl: add GPU timers to measure time between 2 vsync
So far only basic stuff (min/mean/max frame rendering time)
2016-06-26 15:34:36 +02:00
Gregory Hainaut 79587215bb gsdx ogl: add the option force_texture_clear for test purpose
Might be completely useless.

1 => always clear framebuffers and textures to black (aka 0)
2 => always clear framebuffers and textures to red
2016-06-24 18:41:55 +02:00
Gregory Hainaut 3234c8241b gsdx ogl: massively extend glsl self test
* Support Mesa Nouveau IR (free driver for Nvidia's GPU)
  => Print intermediate representation + final shader
  => Dump GPR usage
* Move dumped shader in /tmp/GSdx_Shader/<sub_dir>
  =>  Avoid the landing of 3 thousands of files in $PWD ^^
* Use function instead of macro
2016-06-11 13:34:37 +02:00
Gregory Hainaut 5d49a6b685 gsdx ogl: replace 4 VS shader variation by an AND mask
Perf will be roughly the same. However there is a single VS for all
the HW emulation.
2016-06-01 09:29:56 +02:00
Gregory Hainaut 959abe64f8 gsdx ogl: implement wildhack on the CPU
Speed impact is likely small and the plan is only to keep a single Vertex Shader
2016-06-01 09:29:56 +02:00
Gregory Hainaut f7ddd488e1 gsdx ogl: Extend uniform buffer with channel parameter
Instead to use the standard ps ubo which is used every draw call.
I reused a barely used buffer to reduce the extra cost of the upload
2016-05-29 10:13:43 +02:00
Jonathan Li da2046e90e gsdx: Use alignas instead of __aligned
__aligned is defined in FreeBSD headers and will cause compile errors.
2016-05-21 13:23:11 +01:00
Gregory Hainaut 3ab12cef2f gsdx ogl: accelerate special case of accurate date.
Game often uses date to allow a single pixel pass. If this
use case is detected, stencil buffer will be cleared after first pixels
that pass both depth&stencil test.

It seems to reduce the load on the GPU.

Note: with the help of texture barriere, maybe we could implement the algo
with a single pass.
2016-05-15 17:22:58 +02:00
Gregory Hainaut 5b061e062c gsdx ogl: replace ClearRenderTarget_i by glClearTexSubImage
Avoid state change, avoid potential texture buffer reallocation

Note: require GL_ARB_clear_texture
2016-05-15 15:55:31 +02:00
Gregory Hainaut d41613c46a gsdx ogl: add a Tales of Abyss HLE shader
Again fast and efficient but it relies on CRC

v2: forget to update the precompiled shader...
2016-05-07 22:46:41 +02:00
Gregory Hainaut 14e1ed06df glsl: add an HLE shader for Urban Chaos
Pro:
* Replace 140 draw calls into a single one
* No complex texture conversion/lookup
* smaller solution than a generic solution
2016-04-30 14:52:53 +02:00
Gregory Hainaut c445a14c46 gsdx ogl: extend shader to lookup a single channel 2016-04-28 22:56:38 +02:00
Gregory Hainaut fda511a949 gsdx glsl: extend hw shader to sample depth texture
Will use integral coordinate to avoid any rescaling.

Bilinear interpolation isn't supported. I don't think it is allowed to
filter a depth texture anyway.
2016-04-24 22:18:26 +02:00
Gregory Hainaut 4281b8630b gsdx ogl: remove the useless shadeboost Constant Buffer 2016-04-24 11:08:14 +02:00
Gregory Hainaut 3f404c8edb gsdx-ogl: update shader pipeline intertace to use vs/gs/ps triplet
Better to have 1 function calls with 3 parameters rather 3 functions call with 1 parameter.
2016-04-10 14:14:30 +02:00
Gregory Hainaut f751f70b1e gsdx ogl: GL_ARB_clip_control is now mandatory 2016-04-07 21:57:54 +02:00
Gregory Hainaut b8a023d158 gsdx ogl: mark OGL object as final
Give the compiler more devirtualization hint
2016-04-05 00:01:43 +02:00
Gregory Hainaut decac5fd12 gsdx ogl: implement an empty BeginScene
Compile will devirtualize it and then remove it during the inline.
2016-04-04 23:32:11 +02:00
Gregory Hainaut e3787b6b3c gsdx-ogl: use final qualifier to help compiler
Improve Devirtualization optimization
2016-04-04 22:52:59 +02:00
Gregory Hainaut bc73195193 gsdx-ogl: pack more tightly the FS UBO
Merge TA vec2 + Af vec1 into a single vec4
2016-03-10 18:29:05 +01:00
Gregory Hainaut f8c442cf76 gsdx-ogl: make VS more generic
Texture coordinate could be dummy/float/int integral/int normalized.

Old behavior:
* VS was in charge to select the texture coordinate
* int integral format wasn't supported

New behavior:
* Always compute all formats
* FS will be in charge to select the good format

Impact:
* VS will be slightly slower but it reduces shaders permutation from
   little to 0 (won't be bad for CPU)
* FS speed isn't impacted as 2 separate code paths were already required
  to support both format
* Rasterizer will be 33% slower but unlikely to be the limited factor of
  the GPU
* In future we could directly use the integral format in the FS.

V2: remove useless PSin_t
2016-02-18 19:02:05 +01:00
Gregory Hainaut 6561fbc831 gsdx-ogl: only enable aniso when sampling from the HW texture unit
Potentially help issue #884
2015-10-22 12:21:43 +02:00
Jonathan Li cbd2417833 gsdx:ogl:windows: Fix calling convention mismatch
OpenGL does not use the cdecl calling convention (which is the default
calling convention for GSdx on Windows). Since DebugOutputToFile is used
by OpenGL, it needs to use the same calling convention that OpenGL uses.

This fixes a debug build crash when the OpenGL renderers are used and
debug_opengl is nonzero in the ini.
2015-09-26 22:38:05 +01:00
Gregory Hainaut 5fb8c7e65c gsdx: initialize members in constructor of objects
A couple of useless members were removed too.

Also fix wnd initialization

Coverity:
CID 146955 (#1 of 1): Uninitialized pointer read (UNINIT)
18. uninit_use: Using uninitialized value wnd[i].
2015-09-23 09:46:53 +02:00
Gregory Hainaut 78569ee833 gsdx-ogl: redo properly the setup of texture format
* add lengthly comment to explain the format
* Likely reduce the number of shader permutation
* Avoid slow AEM (on GPU)

Expect regressions because TC needs some fixes

v2: fix palette mode
2015-09-11 14:14:22 +02:00
Gregory Hainaut ca9b5ce11d gsdx-ogl: move depth conversion shader
Add 2 new shaders:
* ps_main12: cast a 16 bit depth to a RGB5A1 color
* ps_main16: cast a a RGB5A1 color to a 16 bit depth

Shader might be used in future commit as it seems Silent Hill uses this
kind of format.

Fix tab/indentation too
2015-09-08 12:41:05 +02:00
Gregory Hainaut 7002ff3ec3 gsdx-ogl: move texture scale from vs_cb to fs_cb
It avoid useless update of the vs_cb.
2015-08-22 13:33:04 +02:00
Gregory Hainaut 6b84a89b6a gsdx-ogl: remove old ati hack for point sampler
Was never used on openGL

In the future absolute coordinate will be use anyway
2015-08-18 19:09:45 +02:00
Gregory Hainaut c7b2f9d1d2 gsdx-ogl: setup fbo to always write to first buffer
GL_NONE was kind of useless because nothing was attached anyway
2015-08-15 20:06:34 +02:00
Gregory Hainaut 61694013a5 gsdx-ogl: compact blending parameter structure
Save 656B of data. It is good for the cache.
2015-08-09 13:44:30 +02:00
Gregory Hainaut df3ade896b gsdx-ogl: use integer for blend factor
Integer argument&comparison might be lighter

V2: Forget to change one OMSetBlendState call
2015-08-09 13:44:05 +02:00
Gregory Hainaut 5b57405517 gsdx-ogl: blend management cleanup
* reorder the blend function
* remove OM bsel object
* add a bit to support pabe (miss the glsl part)
2015-08-08 09:18:09 +02:00
Gregory Hainaut 921b22ac31 gsdx-ogl: drop useless parameter of OMSetDepthStencilState 2015-08-05 22:55:12 +02:00
Gregory Hainaut b7e16b5989 gsdx-ogl: clean the blending management
Intially GSBlendStateOGL was an alias of the m_blendMapD3D9 array

The object was replaced by an index in the array. Save 2k of memory duplication.
And too much useless code.

v2: push/pop blending state in DATE stuff
v3: remove m_state which is useless now
2015-08-05 22:55:12 +02:00
Gregory Hainaut 744f9ebc09 gsdx-ogl: rare corner case when both texture shuffle and date are enabled
In texture shuffle mode the texture data is either RG or BA. It means
that DATE must either checks MSB of G or A.

Close #693
2015-08-04 20:10:44 +02:00
Gregory Hainaut e972f4f4dd gsdx-ogl: extend device to support an offset for normal draw 2015-08-02 21:30:19 +02:00
Gregory Hainaut 59cdf77784 gsdx-ogl: create a new function to set the blending state 2015-08-02 21:30:19 +02:00
Gregory Hainaut 1da611fb75 gsdx-ogl: clean PS selector 2015-08-02 21:30:19 +02:00
Gregory Hainaut 4a3c145c72 gsdx-ogl: depth support: better support of 16 bits z buffer
Fix issue in socom2
2015-08-01 01:28:41 +02:00
Gregory Hainaut 97b38d9e1b gsdx-ogl: directly set impossible mode in the blending table
Avoid to hack it in the creation

Allow in the future to reuse the table directly instead of converting
in a blend object
2015-07-31 09:45:28 +02:00
Gregory Hainaut 8554f32086 gsdx-ogl: clean the blend table
Remove old shader define
Prefix macro with BLEND_
Add some notes to explain the special symbol
2015-07-31 09:45:28 +02:00
Gregory Hainaut cfd0fd6cc8 gsdx-ogl: remove old colclip algo 2015-07-31 09:45:28 +02:00
Gregory Hainaut 93c47feb7c gsdx-ogl: replace old colclip algo with the HDR algo
Similar speed but more accurate

Allow to clean the code
2015-07-31 09:45:28 +02:00
Gregory Hainaut 83f874db93 gsdx-ogl: remove bsel.ps
Just clear bsel.abe to disable blending
2015-07-31 09:45:28 +02:00
Gregory Hainaut 83dfc6b633 gsdx-ogl: clean a bit selector code
Use countof macro (avoid to duplicate the size)
Fix the size of array
Remove useless alpha_stencil case
2015-07-30 18:36:05 +02:00
Gregory Hainaut e026f1bac6 gsdx-ogl: implement a fast accurate colclip algo
The idea is to use a floating texture to accumulate the data and
then do a final postprocessing pass to apply the modulo

v2:
* use bounding box to
* fix vertex corruption issue
* use negative number in shader which allow to use half float (+12
  fps@4x)
2015-07-30 18:34:52 +02:00
Gregory Hainaut ee9edb0b19 gsdx-ogl: create a copy rect with conversion function
Previous CopyRect function does a memcopy without conversion.

This function will allow to use different format for input/output. Just a
possibility for the future
2015-07-30 18:22:59 +02:00
Gregory Hainaut ae8df002af gsdx-ogl: optimize Cs * As + Cd and Cs * Af + Cd blending
Basically the code does the alpha multiplication in the shader therefore
the blend unit only does a pure addition. This way the multiplication is
accurate and accurate_blending doesn't requires a costly barrier.

This code also avoid variable duplication to make the code more separated.
Hopefully blending can be done in a separated function

It is preliminary work to support fast color clipping with HDR

v2: fix assertion compilation failure

v3: fix regression in not accurate mode

v3: Cs * As/Af is not an accumulation

Those cases don't need the Cd addition and were already optimized anyway

Fix a regression on GoW2
2015-07-30 18:22:59 +02:00
Gregory Hainaut 5c740ff41e gsdx-ogl: wipeout AlphaStencil & Alpha hack
Accurate options do a better jobs. Technically it can still
be useful for old gpu/driver that doesn't support the GL4.5 extension.

On Windows, you can still rely on Dx

On linux, free driver support it (except Intel)
2015-07-19 22:43:48 +02:00