Commit Graph

125 Commits

Author SHA1 Message Date
Gregory Hainaut 08291aed0c gsdx-ogl: color state impact the clean command 2015-05-15 15:25:45 +02:00
Gregory Hainaut b7a9465963 gsdx-ogl: update the device to use the new texture flags
Hopefully it will increase a bit the speed
2015-05-12 18:18:20 +02:00
Gregory Hainaut f37f3cb3cf gsdx-ogl: improve texture uploading
Initially we copy pitch by line in the PBO and tell the dma to only
use the first valid byte.

Now, we only copy useful data to the PBO. It reduce the copy and PBO memory requirement.

It seems a bit faster on native resolution
2015-05-11 16:32:13 +02:00
Gregory Hainaut 921fa3bab8 gsdx-ogl/linux: drop fba option
It is useless as DX11
2015-05-11 13:57:26 +02:00
Gregory Hainaut 30e3956ea7 gsdx-ogl: avoid issue with apitrace
The tool doesn't support NaN value so be sure the buffer is correctly initialized
2015-05-11 11:19:18 +02:00
Gregory Hainaut 1523b9534f gsdx-debug: compact the code 2015-05-11 11:19:00 +02:00
Gregory Hainaut f029e4763f gsdx-ogl: add the alpha value in a constant buffer 2015-05-09 15:02:34 +02:00
Gregory Hainaut 625d65d4b4 gsdx-ogl: encode the bogus id as shader parameter 2015-05-09 14:55:44 +02:00
Gregory Hainaut 380e420cdd gsdx-ogl: add a blend parameter to shader 2015-05-08 20:28:50 +02:00
Gregory Hainaut 8e1db43431 gsdx-debug: debug stuff 2015-05-08 19:28:17 +02:00
Gregory Hainaut 1addae1993 gsdx-ogl: don't use extra shader for sprite hack
Atst == 2 && sprite_hack is equivalent to Atst == 1

It frees a bit in the shader selector, and reduces shader combinations.
2015-05-08 19:28:16 +02:00
Gregory Hainaut d6448183d7 gsdx-ogl-debug: insert error message in GL stream
This way, you can see your message in the GL debugger
2015-05-08 00:16:31 +02:00
Gregory Hainaut 51a67029cf gsdx-ogl: add an option to print gl error messages 2015-05-07 23:54:22 +02:00
Gregory Hainaut 6095f40baf gsdx-ogl: add the number of free bit in selector structs 2015-05-07 18:41:10 +02:00
Gregory Hainaut 530e4ce776 gsdx-ogl: drop hack that rescale primitive (to avoid upscale glitch)
Rendering is bad. It renders sprites at native resolution.
2015-05-06 19:09:13 +02:00
Gregory Hainaut 6d65867b26 gsdx-ogl: comment point_sampler
It is not enabled on the shader so I will reuse the bit
2015-05-06 19:09:12 +02:00
Gregory Hainaut 3fcac07120 gsdx-ogl: print some message when blending is not properly supported 2015-05-06 19:09:12 +02:00
Gregory Hainaut 5f5b901bca gsdx-ogl: alpha bending equation/function are constant
Drop useless variables/state checking
2015-05-05 11:20:25 +02:00
Gregory Hainaut 8032e2c369 gsdx-ogl: separate color mask state from the blending state
Unlike DX they're uncorrelated.
2015-05-05 10:26:01 +02:00
Gregory Hainaut 14a1925de0 gsdx ogl: date texture is signed to use i variant 2015-05-02 16:53:34 +02:00
Gregory Hainaut 335695bd0e purge GLES from GSdx !
mobile will use vulkan (or any new API) anyway
2015-05-01 20:02:17 +02:00
Gregory Hainaut 71c26a829e gsdx-ogl: forget to remove this useless call
It is done once in Device Creation instead of per stencil object
2015-05-01 12:10:03 +02:00
Gregory Hainaut ee19a2789c gsdx: move invalidation from GSDevice to GSTexture
Much cleaner this way
2015-04-30 19:55:57 +02:00
Gregory Hainaut 2bd9043657 gsdx-ogl: improve debug of stencil
Note: ENABLE_OGL_STENCIL_DEBUG could be dropped now that I can read the stencil properly
2015-04-30 19:55:57 +02:00
Gregory Hainaut ee244071fa gsdx-ogl: use 64 bits counter + fix division factor
I also added a counter of the real size of the texture.

I have a bad overhead for pbo transfer
2015-04-25 14:18:21 +02:00
Gregory Hainaut 757726bb91 gsdx-ogl: allow to invalidate the texture
It just a hint to the driver to avoid any useless transfer

I don't expect any change but it is free so why not ;)
2015-04-25 12:50:12 +02:00
Gregory Hainaut c207632e49 gsdx-ogl: improve date performance for GL45
If there is no overlap, it is allowed to directly read from the render target.

On SotC testcase with 6x scaling: 30fps -> 40fps

Note: it requires GL_ARB_texture_barrier extension so be sure to have a recent driver

Note2: it requires a lots of testing too

Open question: in case of complex date (written alpha)
Will it be faster to split the draw call into multiple call with no
primitive overlap
2015-04-24 21:12:33 +02:00
Gregory Hainaut 795ae50ecd gsdx-ogl: fix the recently broken advance date feature
Now it is really working with a 2 stages shaders but it is still slow.
2015-04-24 20:13:38 +02:00
Gregory Hainaut 672e3f9533 gsdx-ogl: use DSA for texture management
Yeah code is much nicer :)
2015-04-24 19:34:17 +02:00
Gregory Hainaut 31f8c065db gsdx-ogl: implement a new hack UserHacks_UnscaleSprite for opengl
UserHacks_UnscaleSprite = 1 will unscale flat sprites
UserHacks_UnscaleSprite = 2 will unscale all  sprites (don't work well so far)

The idea of the hack is to redo the interpolation of texture coordinate
based on the non-upscaled pixel position.

It avoids various glitches but sprites aren't upscaled anymore (so no
more anti-aliasing, potentially a coefficient can be added).
2015-04-20 07:18:08 +02:00
Gregory Hainaut 6124eb844e gsdx-ogl: only compile useful VS
logz is a constant
wildhack is only compatbile with TME/FST

Compilation goes down from 64 to 20 vertex shaders.
2015-04-20 07:17:58 +02:00
Gregory Hainaut 418f2e69a8 gsdx-ogl: implement the wildhack on the GPU
Likely much faster for opengl and much easier to implement

Note: hopefully UserHacks_round_sprite_offset will replace it
2015-04-13 22:14:36 +02:00
Gregory Hainaut 920ac6695f gsdx-ogl: add preliminary support of external shader fx 2014-11-10 10:37:58 +01:00
Gregory Hainaut 84f844767c gsdx-ogl: micro optimize PSConstantBuffer cache
Might help to save a cache line on the CPU :)
2014-11-08 21:39:17 +01:00
Gregory Hainaut 6fe9ee387d gsdx-ogl: optimize the PS cb cache
* Don't use the stack
* Don't compare MinMax parameter (depends of others)
* Don't store not-compared parameter in the cache (HalfTexel/MinMax)

+0.3fps/46fps (well better than nothing)
2014-10-02 20:44:22 +02:00
Gregory Hainaut ccc1137e12 gsdx-ogl: merge the two vertex buffer format
* Only a single VAO
    => Format is set once
    => Only a single bind at startup
    => GSVertexBufferStateOGL is nearly useless
    => barely faster but better than nothing :)
2014-10-02 20:44:22 +02:00
Gregory Hainaut 10c7be8c50 gsdx-ogl: Use 32B strides for all VBO 2014-10-02 20:44:22 +02:00
Gregory Hainaut 1e86e3cb08 gsdx-ogl: rework callback debug
* use DebugOutputToFile as a callback of gl error. Add a breakpoint to
  find the culprit GL call
* use string instead of char[n]

Note: CheckDebugLog is potentially useless now
2014-09-28 12:00:34 +02:00
Gregory Hainaut 0d45e6d70e gsdx-ogl: avoid to send constant to the GPU
It was a waste of bandwith
2014-04-06 10:44:40 +02:00
Gregory Hainaut b020bd76c6 gsdx-ogl: restore gles build
Add the --gles build option to the linux main script
Ifdef all gl code not supported on gles3 (note some will be reenabled for gles3.1)

Note: it probably doesn't run anymore. My Nvidia driver doesn't support
yet egl/gles so I can't test it. Feel free to contribute.
2014-03-29 11:55:02 +01:00
Gregory Hainaut 8b78551b92 gsdx-ogl: improve debugging capabilities
allow to print memory transfer usage
Check gl call in dev build
2014-03-25 16:36:29 +01:00
refraction 5b14ca0fb9 GSDX: Clear up all compiler warnings. No changes to emulation.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5840 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-01-26 00:58:21 +00:00
gregory.hainaut c2aa4ff3fd gsdx ogl:
* restore the old fxaa (Asmodeam will be integrated when I got time)
* port the recently added new scanline algo


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5818 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-01-18 14:46:13 +00:00
gregory.hainaut 2238095a82 gsdx ogl: do the same as previous commit but for ogl ;)
* fix the missing auto interlace opt when cycling with hotkey on linux


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5810 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-01-12 11:38:50 +00:00
gregory.hainaut e80b002929 gsdx ogl: Flush various pending work
* try to use more subroutine on VS&PS, unfortunately hit a driver crash!
* Call Attach/DetachContext through GSDevice so I can unmap currently mapped buffer
* Implement glsl part of GL_ARB_bindless texture, again hit another driver crash!
* various fix of GL_ARB_buffer_storage. Basic benchmark show only improvement on 'cold' case, I guess it will improve smoothness
* try to fix GL_clear_texture, no success so far. It seem the extension is limited to basic texture (aka no depth/stencil)



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5752 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-10-24 20:54:27 +00:00
gregory.hainaut e01c6cd9ce gsdx ogl: the proof of concept commit
* GL_ARB_shader_subroutine for perf
fix for nvidia => add missing shader declaration. Nvidia got +4fps on colin3 :) 

For the moment only 2 PS parameters are supported. Code need to be extended to support others games that often
switch shader program (like xenosaga).
require GL4 class hardware and the option override_GL_ARB_shader_subroutine = 1
Note: strangely on AMD linux it is slower!

* GL_ARB_shader_image_load_store for accuraccy (Date)
Use a signed integer texture and reenable color buffer writing

Current status: Amagami_transparency.gs & P3_battle_shadows.gs are now working on Nvidia with a small perf impact.
Current implementation detail:
1/ setup the standard stencil as before
2/ on remaining pixel, draw once to compute first primitive that will write a fail alpha value.
3/ final draw based on primitive id of step 2

Note: I think we would get a bad behavior if depth test&mask are enabled on step 2/3
Note2: on my limited testcase the perf impact was on CPU. It would be possible to merge step1&2 to nullifying it (could 
even be faster actually), however it would require more GPU power.

Again require GL4 class hardware. And the option UserHacks_DateGL4 = 1



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5725 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-28 08:44:16 +00:00
gregory.hainaut 07605941ef gsdx ogl:
* some preliminary work to test/benchmark bindless texture in the future (glsl was not yet updated)
Bindless texture allow to get a GPU texture pointer and then set it directly
to the shader as a basic uniform.
=> no more texture unit selection/validation
=> no more texture validation neither texture hash lookup

3rdparty: update gl header to the latest gl4.4


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5720 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-17 08:57:52 +00:00
gregory.hainaut b4084047be gsdx ogl: Used a basic flat interpolation for color interpolation (line & tri primitives)
Card that support gs:
remain only a gs to generate sprite from a line. Even dummy gs are costly for the GPU.

Card that don't support gs:
remove useless copy of color for line and triangle primitives

Note for dx: opengl 3.2 (maybe not gles) supports both flat interpolation
convention (GL_FIRST_VERTEX_CONVENTION or GL_LAST_VERTEX_CONVENTION).  It might
be possible to shuffle vertex index to put the last vertex in first position.

- buff[0] = head + 0;
- buff[1] = head + 1;
- buff[2] = head + 2;
+ buff[0] = head + 2;
+ buff[1] = head + 1;
+ buff[2] = head + 0;



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5718 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-14 10:18:38 +00:00
gregory.hainaut 0f603a98d5 gsdx ogl: Test the ARB_shader_subroutine GL4.0 extension
The idea was to replace shader program swith by pointer function calls inside
shaders.  At least parameters that are often changed between draw call. So far
I only ported atst and colclip. Unfortunately code is "slower" (on GSdx standalone).
For the moment keep the code but disabled.

If I understand well the validation of program is done in the "driver thread"
but the additional call are done in the overloaded MTGS thread. Apitrace
profiling shows faster GPU draw calls. Another possibility is that the driver still
need to validate the draw call because of others state change.

Here some stats on colin3 (90 frames):
without subroutine: UseProgram 125246
with subroutine: UseProgram 2906, subroutine 125945 => 3605 extra calls overhead (not
all parameters are ported to subroutine)



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5715 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-10 19:43:59 +00:00
gregory.hainaut c9755361ec gsdx ogl: remove useless ps.rt
Note: I think we can do the same on DX11

Perf wise: on colin mcrae 3 it reduces shader prog setup from 3005 to 2086 each frames. It saves 2 ms of CPU processing (27->29fps)



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5714 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-06 06:48:44 +00:00