Commit Graph

83 Commits

Author SHA1 Message Date
Gregory Hainaut d870188d21 gsdx: sed/o/off/ 2015-05-15 20:40:09 +02:00
Gregory Hainaut 2e34d48e97 gsdx-ogl: add a virtual GetID method for texture
Much more readable
2015-05-12 17:41:41 +02:00
Gregory Hainaut e0012811ae gsdx-debug: more debug info in gl trace 2015-05-12 12:36:34 +02:00
Gregory Hainaut 921fa3bab8 gsdx-ogl/linux: drop fba option
It is useless as DX11
2015-05-11 13:57:26 +02:00
Gregory Hainaut 1523b9534f gsdx-debug: compact the code 2015-05-11 11:19:00 +02:00
Gregory Hainaut 5565544ba6 gsdx-ogl: DATE with texture barrier
Much faster for small batch that write the alpha value. Code can
be enabled with accurate_date option.

Here a summary of all DATE possibilities:
1/ no overlap of primitive
    => texture barrier (pro no setup of stencil and single draw)

2/ alpha written
    => small batch => texture barrier (primitive by primitive). Done in N-primitive draw calls.
    (based on GL_ARB_texture_barrier)

    => bigger batch => compute the first good primitive, slow but only 2 draw calls.
    (based on GL_ARB_shader_image_load_store)

    => Otherwise there is the UserHacks_AlphaStencil but it is a hack!

3/ alpha written
    => full setup of stencil ( 2 draw calls)
2015-05-09 19:54:01 +02:00
Gregory Hainaut f029e4763f gsdx-ogl: add the alpha value in a constant buffer 2015-05-09 15:02:34 +02:00
Gregory Hainaut 472608b879 gsdx-ogl: add accurate blending implementation 2015-05-09 15:02:13 +02:00
Gregory Hainaut 1e8aea033c gsdx-ogl: add a wrapper to control the drawing of primitives
No barrier => draw all primitives
Barrier but without overlap => draw all primitives
Barrier with overlap => draw primitive by primitve

It will ease the implementation of accurate blending and why not date too
2015-05-09 15:01:39 +02:00
Gregory Hainaut 380e420cdd gsdx-ogl: add a blend parameter to shader 2015-05-08 20:28:50 +02:00
Gregory Hainaut 8e1db43431 gsdx-debug: debug stuff 2015-05-08 19:28:17 +02:00
Gregory Hainaut 3c66da4d82 gsdx: remove old code 2015-05-08 19:28:16 +02:00
Gregory Hainaut 1addae1993 gsdx-ogl: don't use extra shader for sprite hack
Atst == 2 && sprite_hack is equivalent to Atst == 1

It frees a bit in the shader selector, and reduces shader combinations.
2015-05-08 19:28:16 +02:00
Gregory Hainaut b490085214 gsdx-ogl: add 2 options in the GUI to ease testing
accurate_date => use an emulated stencil to compute destination alpha
    test (could be slow)

accurate_blend => do nothing (future)
2015-05-08 19:28:16 +02:00
Gregory Hainaut d6448183d7 gsdx-ogl-debug: insert error message in GL stream
This way, you can see your message in the GL debugger
2015-05-08 00:16:31 +02:00
Gregory Hainaut cc4713d379 gsdx-debug: extend ogl debug capabilities
Group opengl calls into a nice name.

Apitrace shows them in a tree format that support folding. Previously it
was a long flat list (10K-40K of lines by frame)

I align the call number with the internal s_n variable. This way it is
easy to map GSdx dump output with the GL debugger :)
2015-05-06 19:09:13 +02:00
Gregory Hainaut 530e4ce776 gsdx-ogl: drop hack that rescale primitive (to avoid upscale glitch)
Rendering is bad. It renders sprites at native resolution.
2015-05-06 19:09:13 +02:00
Gregory Hainaut 6d65867b26 gsdx-ogl: comment point_sampler
It is not enabled on the shader so I will reuse the bit
2015-05-06 19:09:12 +02:00
Gregory Hainaut 3fcac07120 gsdx-ogl: print some message when blending is not properly supported 2015-05-06 19:09:12 +02:00
Gregory Hainaut 8032e2c369 gsdx-ogl: separate color mask state from the blending state
Unlike DX they're uncorrelated.
2015-05-05 10:26:01 +02:00
Gregory Hainaut 73d04e33e9 gsdx ogl: clean various comment and old code 2015-05-01 20:04:23 +02:00
Gregory Hainaut 335695bd0e purge GLES from GSdx !
mobile will use vulkan (or any new API) anyway
2015-05-01 20:02:17 +02:00
Gregory Hainaut 7367b22e03 gsdx-ogl: reduce toggling of scissor state for DATE 2015-05-01 00:59:49 +02:00
Gregory Hainaut baf84b98c4 gsdx-ogl: add override_GL_ARB_texture_barrier option
To ease regression test.
2015-04-25 09:59:25 +02:00
Gregory Hainaut c207632e49 gsdx-ogl: improve date performance for GL45
If there is no overlap, it is allowed to directly read from the render target.

On SotC testcase with 6x scaling: 30fps -> 40fps

Note: it requires GL_ARB_texture_barrier extension so be sure to have a recent driver

Note2: it requires a lots of testing too

Open question: in case of complex date (written alpha)
Will it be faster to split the draw call into multiple call with no
primitive overlap
2015-04-24 21:12:33 +02:00
Gregory Hainaut 795ae50ecd gsdx-ogl: fix the recently broken advance date feature
Now it is really working with a 2 stages shaders but it is still slow.
2015-04-24 20:13:38 +02:00
Gregory Hainaut 672e3f9533 gsdx-ogl: use DSA for texture management
Yeah code is much nicer :)
2015-04-24 19:34:17 +02:00
Gregory Hainaut 6e386df535 gsdx-ogl: avoid to clean fully texture in DATE
Is is useless and it has a small impact on performance for big upscale
2015-04-24 18:32:08 +02:00
Gregory Hainaut bd6ea17bdc gsdx-ogl: speed improvement for DATE
DATE is implemented in 2 ways.
1/ with stencil
2/ purely in FS (sw)

I kept method 1 to reduce the work on method 2. It sucks for performance.
So it would be either 1 or 2.

Note: DATE has a big impact on higher upscaling
Note2: you can disable the 2nd method with this configuration parameter

override_GL_ARB_shader_image_load_store = 0
2015-04-22 00:36:34 +02:00
Gregory Hainaut 31f8c065db gsdx-ogl: implement a new hack UserHacks_UnscaleSprite for opengl
UserHacks_UnscaleSprite = 1 will unscale flat sprites
UserHacks_UnscaleSprite = 2 will unscale all  sprites (don't work well so far)

The idea of the hack is to redo the interpolation of texture coordinate
based on the non-upscaled pixel position.

It avoids various glitches but sprites aren't upscaled anymore (so no
more anti-aliasing, potentially a coefficient can be added).
2015-04-20 07:18:08 +02:00
Gregory Hainaut 6124eb844e gsdx-ogl: only compile useful VS
logz is a constant
wildhack is only compatbile with TME/FST

Compilation goes down from 64 to 20 vertex shaders.
2015-04-20 07:17:58 +02:00
Gregory Hainaut 65a84b9f64 gsdx hack: drop previous filtering hack
superseeded by UserHacks_round_sprite_offset
2015-04-15 09:26:01 +02:00
Gregory Hainaut 418f2e69a8 gsdx-ogl: implement the wildhack on the GPU
Likely much faster for opengl and much easier to implement

Note: hopefully UserHacks_round_sprite_offset will replace it
2015-04-13 22:14:36 +02:00
Gregory Hainaut 98d8ad7484 gsdx: new anti-glitches upscaling hack: UserHacks_round_sprite_offset
It is replacement of the previous hack (UserHacks_stretch_sprite). Don't enable both in the same time!

The idea of the hack is to move the sprite to the pixel boundary. It
avoids most of rounding issue. It also rescales verticaly the sprite (avoid horizontal line on ace combat).
I don't like this rescaling maybe we can limit it to only 1 pixels.

On my limited testcase, results are much better with any upscaling factor.

I still have a bad line in Kingdom heart. If you have issue with others
game please provide us a GS dump.
2015-04-07 19:18:24 +02:00
Gregory Hainaut 5269e54f02 gsdx: tune previous hack
Only disable bilinear on the sprite that were forced by the user.
If the PS2 requires a bilinear filtering, there is likely a big enough texture
2015-04-03 20:09:02 +02:00
Gregory Hainaut e1a5736583 gsdx: anti-upscale-glitch hack UserHacks_SkipDraw
2x upscaling is pixel perfects. Bigger upscaling is better but not yet perfect

Feedbacks are welcomes (note it doesn't solve all upscaling issue, only wrong texture sampling)

For the history:
If you have a texture of [0;16[ texels and draws a primitive [0;16[

The formulae to sample last pixels of texture is
0.5 + (16*s-1)/(16*s) * 16
Native (s==1): 15.5 (good)
2x     (s==2): 16 (bad, outside of the texture)
4x     (s==4): 16.25 (bad, really outside of the texure))
2015-04-03 18:33:05 +02:00
Gregory Hainaut ccc1137e12 gsdx-ogl: merge the two vertex buffer format
* Only a single VAO
    => Format is set once
    => Only a single bind at startup
    => GSVertexBufferStateOGL is nearly useless
    => barely faster but better than nothing :)
2014-10-02 20:44:22 +02:00
Gregory Hainaut 10c7be8c50 gsdx-ogl: Use 32B strides for all VBO 2014-10-02 20:44:22 +02:00
Gregory Hainaut 0d45e6d70e gsdx-ogl: avoid to send constant to the GPU
It was a waste of bandwith
2014-04-06 10:44:40 +02:00
Gregory Hainaut b020bd76c6 gsdx-ogl: restore gles build
Add the --gles build option to the linux main script
Ifdef all gl code not supported on gles3 (note some will be reenabled for gles3.1)

Note: it probably doesn't run anymore. My Nvidia driver doesn't support
yet egl/gles so I can't test it. Feel free to contribute.
2014-03-29 11:55:02 +01:00
gregory.hainaut 1922598f0f the "don't let users shoot themself in the foot" commit
UserHacks_AlphaStencil will take precedence on override_GL_ARB_shader_image_load_store


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5891 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-02-08 11:22:51 +00:00
gregory.hainaut 3d7b86660a gsdx ogl: Enable by default advance DATE effect on nvidia (GL4 GPU)
Note: DATE is used for shadows (persona 3) and others effect

You can disable it when you disable gl image extension (override_GL_ARB_shader_image_load_store = 0)
You can also enable it on AMD too (set it to 1 this time) but no guarantee.

If you feel the extension is too slow, you can try disable some gl barrier (aka "damn the torpedo full steam ahead!").
It can be done with the option UserHacks_DateGL4 = 1. Otherwise just disable the extension.

Note: don't enable UserHacks_AlphaStencil in the same time. GL_ARB_shader_image_load_store is an alternative implementation.
Enabling both in the same time will lead to undefined (well surely wrong) behavior.




git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5890 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-02-08 11:05:26 +00:00
gregory.hainaut 6cf75e086b gsdx ogl: Only enable advance DATE when user request it
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5864 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-02-01 18:46:18 +00:00
gregory.hainaut e80b002929 gsdx ogl: Flush various pending work
* try to use more subroutine on VS&PS, unfortunately hit a driver crash!
* Call Attach/DetachContext through GSDevice so I can unmap currently mapped buffer
* Implement glsl part of GL_ARB_bindless texture, again hit another driver crash!
* various fix of GL_ARB_buffer_storage. Basic benchmark show only improvement on 'cold' case, I guess it will improve smoothness
* try to fix GL_clear_texture, no success so far. It seem the extension is limited to basic texture (aka no depth/stencil)



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5752 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-10-24 20:54:27 +00:00
gregory.hainaut e01c6cd9ce gsdx ogl: the proof of concept commit
* GL_ARB_shader_subroutine for perf
fix for nvidia => add missing shader declaration. Nvidia got +4fps on colin3 :) 

For the moment only 2 PS parameters are supported. Code need to be extended to support others games that often
switch shader program (like xenosaga).
require GL4 class hardware and the option override_GL_ARB_shader_subroutine = 1
Note: strangely on AMD linux it is slower!

* GL_ARB_shader_image_load_store for accuraccy (Date)
Use a signed integer texture and reenable color buffer writing

Current status: Amagami_transparency.gs & P3_battle_shadows.gs are now working on Nvidia with a small perf impact.
Current implementation detail:
1/ setup the standard stencil as before
2/ on remaining pixel, draw once to compute first primitive that will write a fail alpha value.
3/ final draw based on primitive id of step 2

Note: I think we would get a bad behavior if depth test&mask are enabled on step 2/3
Note2: on my limited testcase the perf impact was on CPU. It would be possible to merge step1&2 to nullifying it (could 
even be faster actually), however it would require more GPU power.

Again require GL4 class hardware. And the option UserHacks_DateGL4 = 1



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5725 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-28 08:44:16 +00:00
gregory.hainaut 07605941ef gsdx ogl:
* some preliminary work to test/benchmark bindless texture in the future (glsl was not yet updated)
Bindless texture allow to get a GPU texture pointer and then set it directly
to the shader as a basic uniform.
=> no more texture unit selection/validation
=> no more texture validation neither texture hash lookup

3rdparty: update gl header to the latest gl4.4


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5720 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-17 08:57:52 +00:00
gregory.hainaut b4084047be gsdx ogl: Used a basic flat interpolation for color interpolation (line & tri primitives)
Card that support gs:
remain only a gs to generate sprite from a line. Even dummy gs are costly for the GPU.

Card that don't support gs:
remove useless copy of color for line and triangle primitives

Note for dx: opengl 3.2 (maybe not gles) supports both flat interpolation
convention (GL_FIRST_VERTEX_CONVENTION or GL_LAST_VERTEX_CONVENTION).  It might
be possible to shuffle vertex index to put the last vertex in first position.

- buff[0] = head + 0;
- buff[1] = head + 1;
- buff[2] = head + 2;
+ buff[0] = head + 2;
+ buff[1] = head + 1;
+ buff[2] = head + 0;



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5718 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-14 10:18:38 +00:00
gregory.hainaut 0f603a98d5 gsdx ogl: Test the ARB_shader_subroutine GL4.0 extension
The idea was to replace shader program swith by pointer function calls inside
shaders.  At least parameters that are often changed between draw call. So far
I only ported atst and colclip. Unfortunately code is "slower" (on GSdx standalone).
For the moment keep the code but disabled.

If I understand well the validation of program is done in the "driver thread"
but the additional call are done in the overloaded MTGS thread. Apitrace
profiling shows faster GPU draw calls. Another possibility is that the driver still
need to validate the draw call because of others state change.

Here some stats on colin3 (90 frames):
without subroutine: UseProgram 125246
with subroutine: UseProgram 2906, subroutine 125945 => 3605 extra calls overhead (not
all parameters are ported to subroutine)



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5715 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-10 19:43:59 +00:00
gregory.hainaut c9755361ec gsdx ogl: remove useless ps.rt
Note: I think we can do the same on DX11

Perf wise: on colin mcrae 3 it reduces shader prog setup from 3005 to 2086 each frames. It saves 2 ms of CPU processing (27->29fps)



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5714 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-06 06:48:44 +00:00
gregory.hainaut a46b489a24 gsdx ogl: various minor optimization.
* move most of gl states into a separate namespace. Extend it to depth/stencil/blend micro state
=> save 10,000 opengl call by frame for colin mcrae 3
* Only setup blend state of first drawbuffer
* Don't request anymore a debug context on dev/release build



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5713 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-05 20:25:25 +00:00