Commit Graph

320 Commits

Author SHA1 Message Date
Gregory Hainaut 58ce7d4bb8 gsdx-ogl: emulate texture shuffle
GS doesn't supports texture shuffle/swizzle so it is emulated in a
complex way.

The idea is to read/write the 32 bits color format as a 16 bit format.
This way, RG (16 lsb bits) or BA (16 msb bits) can be read or written with
square texture that targets pixels 1-8 or pixels 8-16.
However shuffle is limited. For example you can copy the green channel
to either the alpha channel or another green channel.

Note: Partial masking of channel is not yet implemented

V2: improve logging
V3: better support of green channel in shader
V4: improve detection of destination (issue due to rounding)
2015-07-01 09:30:20 +02:00
Gregory Hainaut 35081f922a gsdx: GS kinds of support draw without framebuffer
Gow uses 24 bits buffer, so only color is updated but blending is configured as Cd
so it is a NOP

In this case, we don't lookup the target in the texture cache. It reduces the complexity
to handle depth which can be located at same address as RT

Note: please test DX renderer
2015-07-01 09:30:20 +02:00
Gregory Hainaut 5bf5b5bca4 gsdx-ogl: extend StretchRect to write in depth texture
It will allow to convert color texture to depth texture
2015-06-07 12:39:23 +02:00
Gregory Hainaut 2cbde89084 Merge pull request #555 from PCSX2/real-fb-format
GSdx: better framebuffer format
2015-06-01 11:48:07 +02:00
Gregory Hainaut f81cf360bc Merge pull request #545 from PCSX2/gsdx-half-screen-snow-engine
Gsdx half screen (most of) snow engine games
2015-06-01 11:47:40 +02:00
Gregory Hainaut 9fa473a57d gsdx-ogl: glDebugMessageCallback is optional
Not even used in release build
2015-05-31 17:55:34 +02:00
Gregory Hainaut 92d68b70d3 gsdx-ogl: add a performance note for later
The idea will be to replace StretchRect for standard case with a framebuffer
blit. Potentially it toggles less gl state.

Worth a test on Star Ocean 3 that uses a lots this function for stencil emulation
2015-05-30 09:58:46 +02:00
Gregory Hainaut 02274601b3 gsdx-ogl: always copy texture into 0,0 2015-05-29 12:18:54 +02:00
Gregory Hainaut d31bd97d59 gsdx-ogl: add a variable to select FB output
Either 32bits/24bits/16bits
2015-05-26 14:59:07 +02:00
Gregory Hainaut 3be5a6036b gsdx-ogl: remove assertion + debug message
opengl use a different object to compute the vertex count
2015-05-26 11:03:59 +02:00
Gregory Hainaut b0af54d33e gsdx-ogl: better support of palette
The purpose of the code is to support alpha channel
of RT uses as an index for a palette texure.

I'm afraid that code will likely break pure palette texture. Only used
if paltex is enabled

It fixes missing shadow in Star Ocean 3 (issue #374) in Native resolution
with filter = 0  (no filtering) or = 2 (normal fitering)

Rendering explanation:
The game emulates a stencil buffer with the alpha channel

The alpha channel of the RT can contains a palette texture index (format 4HH)
The idea is to have a gradient of value in the palette (16/32/48/...).
This way you can implement a +16/-16 and even wrap the alpha value every time
you hit the pixel.

Bilinear filtering breaks the rendering because it interpolates between counts
so you doesn't have the exact count

Upscaling breaks the rendering because the RT is reused as an input texture. It means
that we need to scale it down which again create some interpolations.
2015-05-24 18:07:16 +02:00
Gregory Hainaut 7609fdc576 gsdx: rename too much sr to sRect 2015-05-21 09:48:15 +02:00
Gregory Hainaut 2d54d59add gsdx-ogl: update assertion ps sel is 8B now 2015-05-20 08:40:08 +02:00
Gregory Hainaut d3d5a436ea gsdx-ogl: add code to read back depth texture 2015-05-20 08:07:40 +02:00
Gregory Hainaut a4c74ef872 gsdx-ogl: disable GL_DITHER
I really don't know the impact but it is not supported with integer texture.
2015-05-20 08:05:27 +02:00
Gregory Hainaut 8d3e3e6c5b gsdx-ogl: more blend rework to support accurate_colclip
So far few blending equations are implemented in PS. It is only
for test the behavior on GoW
2015-05-20 08:00:40 +02:00
Gregory Hainaut c5341a2711 gsdx-ogl: update blending management
This way it will allow to implement all blendings operartion in FS.

Of course it will be slow, but it would be nice for debug and quickly check
game error rendering.
2015-05-20 00:12:52 +02:00
Gregory Hainaut 1837001e75 gsdx: extend CopyOffscreen with a new shader parameter
Currently we're trying to infer the conversion shader based on the output format

It only works if the input data is RGBA8. It might not be true in the future
2015-05-19 13:14:18 +02:00
Gregory Hainaut 19d9349b0b gsdx-debug: remove old assert 2015-05-18 16:45:38 +02:00
Gregory Hainaut 79a9254894 gsdx-ogl: print some error messages if extenal shader is wrongly configured 2015-05-18 10:56:32 +02:00
Gregory Hainaut 5cfb496700 gsdx-ogl: only open debug file once 2015-05-17 14:43:56 +02:00
Gregory Hainaut b1ea081fc3 gsdx-debug: improve tracing interface
Basically move the format and c_str() in the macro
2015-05-17 13:05:08 +02:00
Gregory Hainaut c567198967 gsdx-ogl: replaced draw_count by s_n
This way error message is aligned with everything else :)

It is not perfect (normally it must be done in start of main draw)
2015-05-16 13:59:13 +02:00
Gregory Hainaut 62e0e6a067 gsdx: remove deprecated code
Core was fixed to call GSFifo in the good thread
2015-05-16 13:52:16 +02:00
Gregory Hainaut 3f278382a1 gsdx-debug: dump all drawing register
Thanks PERL
2015-05-16 11:16:33 +02:00
Gregory Hainaut 28bb64aae8 gsdx: sed/dr/dRect/ 2015-05-15 20:49:32 +02:00
Gregory Hainaut 445c28fe97 gsdx: sed/sr/sRect/ 2015-05-15 20:47:14 +02:00
Gregory Hainaut d566bb2a23 gsdx: sed /st/sTex/ 2015-05-15 20:45:31 +02:00
Gregory Hainaut 6a9e425308 gsdx: sed /dt/dTex/ 2015-05-15 20:44:15 +02:00
Gregory Hainaut a5e424512c gsdx-ogl: really avoid consecutive clean 2015-05-15 16:00:46 +02:00
Gregory Hainaut 84c3592fbe gsdx-debug: more debug message/group 2015-05-15 16:00:45 +02:00
Gregory Hainaut 08291aed0c gsdx-ogl: color state impact the clean command 2015-05-15 15:25:45 +02:00
Gregory Hainaut b7a9465963 gsdx-ogl: update the device to use the new texture flags
Hopefully it will increase a bit the speed
2015-05-12 18:18:20 +02:00
Gregory Hainaut 2e34d48e97 gsdx-ogl: add a virtual GetID method for texture
Much more readable
2015-05-12 17:41:41 +02:00
Gregory Hainaut e0012811ae gsdx-debug: more debug info in gl trace 2015-05-12 12:36:34 +02:00
Gregory Hainaut 4e222f18cd gsdx-ogl: it was a bad idea to use DSA on fb
Actually I'm not sure we can mix both dsa/standard approach
2015-05-11 18:04:16 +02:00
Gregory Hainaut f37f3cb3cf gsdx-ogl: improve texture uploading
Initially we copy pitch by line in the PBO and tell the dma to only
use the first valid byte.

Now, we only copy useful data to the PBO. It reduce the copy and PBO memory requirement.

It seems a bit faster on native resolution
2015-05-11 16:32:13 +02:00
Gregory Hainaut 4e2e9aa56c gsdx-ogl: always read the first attachment of the fbo 2015-05-11 16:28:34 +02:00
Gregory Hainaut 1523b9534f gsdx-debug: compact the code 2015-05-11 11:19:00 +02:00
Gregory Hainaut 625d65d4b4 gsdx-ogl: encode the bogus id as shader parameter 2015-05-09 14:55:44 +02:00
Gregory Hainaut 380e420cdd gsdx-ogl: add a blend parameter to shader 2015-05-08 20:28:50 +02:00
Gregory Hainaut 8e1db43431 gsdx-debug: debug stuff 2015-05-08 19:28:17 +02:00
Gregory Hainaut 1addae1993 gsdx-ogl: don't use extra shader for sprite hack
Atst == 2 && sprite_hack is equivalent to Atst == 1

It frees a bit in the shader selector, and reduces shader combinations.
2015-05-08 19:28:16 +02:00
Gregory Hainaut e87d129b09 gsdx-ogl: more verbose debug 2015-05-08 01:01:01 +02:00
Gregory Hainaut d6448183d7 gsdx-ogl-debug: insert error message in GL stream
This way, you can see your message in the GL debugger
2015-05-08 00:16:31 +02:00
Gregory Hainaut 51a67029cf gsdx-ogl: add an option to print gl error messages 2015-05-07 23:54:22 +02:00
Gregory Hainaut 7518b2ef21 gsdx-gui-linux: add debug option in the gui
* Only for debug/dev build
* look awful (expand/fill) but otherwise it is nice for the debug ;)
2015-05-07 23:54:22 +02:00
Gregory Hainaut ba21879059 gsdx-ogl: 1x aniso <=> off 2015-05-07 22:13:49 +02:00
Gregory Hainaut 6095f40baf gsdx-ogl: add the number of free bit in selector structs 2015-05-07 18:41:10 +02:00
Gregory Hainaut cc4713d379 gsdx-debug: extend ogl debug capabilities
Group opengl calls into a nice name.

Apitrace shows them in a tree format that support folding. Previously it
was a long flat list (10K-40K of lines by frame)

I align the call number with the internal s_n variable. This way it is
easy to map GSdx dump output with the GL debugger :)
2015-05-06 19:09:13 +02:00
Gregory Hainaut 530e4ce776 gsdx-ogl: drop hack that rescale primitive (to avoid upscale glitch)
Rendering is bad. It renders sprites at native resolution.
2015-05-06 19:09:13 +02:00
Gregory Hainaut 6d65867b26 gsdx-ogl: comment point_sampler
It is not enabled on the shader so I will reuse the bit
2015-05-06 19:09:12 +02:00
Gregory Hainaut 5f5b901bca gsdx-ogl: alpha bending equation/function are constant
Drop useless variables/state checking
2015-05-05 11:20:25 +02:00
Gregory Hainaut 8032e2c369 gsdx-ogl: separate color mask state from the blending state
Unlike DX they're uncorrelated.
2015-05-05 10:26:01 +02:00
Gregory Hainaut 14a1925de0 gsdx ogl: date texture is signed to use i variant 2015-05-02 16:53:34 +02:00
Gregory Hainaut f37ef105c5 gsdx-ogl: add support for anisotropy
Close feature request #447
2015-05-02 10:54:58 +02:00
Gregory Hainaut 73d04e33e9 gsdx ogl: clean various comment and old code 2015-05-01 20:04:23 +02:00
Gregory Hainaut 335695bd0e purge GLES from GSdx !
mobile will use vulkan (or any new API) anyway
2015-05-01 20:02:17 +02:00
Gregory Hainaut 7367b22e03 gsdx-ogl: reduce toggling of scissor state for DATE 2015-05-01 00:59:49 +02:00
Gregory Hainaut b65a62096f gsdx-ogl: drop support of ENABLE_OGL_DEBUG
Stencil can be read by GL debugger due to correct mask configuration
2015-04-30 20:02:51 +02:00
Gregory Hainaut ee19a2789c gsdx: move invalidation from GSDevice to GSTexture
Much cleaner this way
2015-04-30 19:55:57 +02:00
Gregory Hainaut 2bd9043657 gsdx-ogl: improve debug of stencil
Note: ENABLE_OGL_STENCIL_DEBUG could be dropped now that I can read the stencil properly
2015-04-30 19:55:57 +02:00
Gregory Hainaut f0181d98fd gsdx-ogl: save the texture state 2015-04-30 09:57:30 +02:00
Gregory Hainaut 0ab0c6cfba gsdx: verbose debug option
Print opengl error message on stderr

Rename Debug.txt into GSdx_opengl_debug.txt
2015-04-27 19:30:03 +02:00
Gregory Hainaut 46ff4dc3d3 gsdx-ogl: hardware unit only support normalization of 4 bytes...
(At least on recent AMD GPU)
2015-04-27 18:51:59 +02:00
Gregory Hainaut ee244071fa gsdx-ogl: use 64 bits counter + fix division factor
I also added a counter of the real size of the texture.

I have a bad overhead for pbo transfer
2015-04-25 14:18:21 +02:00
Gregory Hainaut 47a0026b60 gsdx-ogl: print the bandwidth of uniform 2015-04-25 13:00:03 +02:00
Gregory Hainaut 757726bb91 gsdx-ogl: allow to invalidate the texture
It just a hint to the driver to avoid any useless transfer

I don't expect any change but it is free so why not ;)
2015-04-25 12:50:12 +02:00
Gregory Hainaut 36514bd95f glsl: fog is a single byte
Give a chance to the driver to optimize if possible
2015-04-24 21:37:37 +02:00
Gregory Hainaut c207632e49 gsdx-ogl: improve date performance for GL45
If there is no overlap, it is allowed to directly read from the render target.

On SotC testcase with 6x scaling: 30fps -> 40fps

Note: it requires GL_ARB_texture_barrier extension so be sure to have a recent driver

Note2: it requires a lots of testing too

Open question: in case of complex date (written alpha)
Will it be faster to split the draw call into multiple call with no
primitive overlap
2015-04-24 21:12:33 +02:00
Gregory Hainaut 795ae50ecd gsdx-ogl: fix the recently broken advance date feature
Now it is really working with a 2 stages shaders but it is still slow.
2015-04-24 20:13:38 +02:00
Gregory Hainaut 672e3f9533 gsdx-ogl: use DSA for texture management
Yeah code is much nicer :)
2015-04-24 19:34:17 +02:00
Gregory Hainaut 6e386df535 gsdx-ogl: avoid to clean fully texture in DATE
Is is useless and it has a small impact on performance for big upscale
2015-04-24 18:32:08 +02:00
Gregory Hainaut 03e72781aa gsdx-ogl: drop support of GL_ARB_clear_texture extension
Extension is a bit slower.

We use it to clear the RT but we generally use it right away so
we don't avoid the FB attachment.
2015-04-24 18:15:58 +02:00
Gregory Hainaut 19eb1f00d1 gsdx ogl: flush vbo range instead of barrier
For testing purpose. I don't know which one is better.

It seems flushing have less fps fluctuation than barrier.
2015-04-21 21:44:50 +02:00
Gregory Hainaut ce98276322 gsdx-ogl: improve speed of vertex streaming
Note yet enabled because I'm afraid of data corruption but feel free to test it

The option:
ogl_vertex_storage = 1

Performance note (warm cache+gs replay on colin3)
60 fps -> 76 fps
2015-04-20 09:38:03 +02:00
Gregory Hainaut 62489f42f1 gsdx-ogl: add an optimization note for later
Only 1 byte of fog is useful
2015-04-20 07:18:09 +02:00
Gregory Hainaut 31f8c065db gsdx-ogl: implement a new hack UserHacks_UnscaleSprite for opengl
UserHacks_UnscaleSprite = 1 will unscale flat sprites
UserHacks_UnscaleSprite = 2 will unscale all  sprites (don't work well so far)

The idea of the hack is to redo the interpolation of texture coordinate
based on the non-upscaled pixel position.

It avoids various glitches but sprites aren't upscaled anymore (so no
more anti-aliasing, potentially a coefficient can be added).
2015-04-20 07:18:08 +02:00
Gregory Hainaut 6124eb844e gsdx-ogl: only compile useful VS
logz is a constant
wildhack is only compatbile with TME/FST

Compilation goes down from 64 to 20 vertex shaders.
2015-04-20 07:17:58 +02:00
Gregory Hainaut 15264c6c63 glsl: split the main shader
* separate VS/GS and FS
* separate subroutine part of the FS

It already complex enough without subroutine stuff. Besides I'm not sure
we will keep subroutine on the future.
2015-04-19 18:49:02 +02:00
Gregory Hainaut 418f2e69a8 gsdx-ogl: implement the wildhack on the GPU
Likely much faster for opengl and much easier to implement

Note: hopefully UserHacks_round_sprite_offset will replace it
2015-04-13 22:14:36 +02:00
Gregory Hainaut 8c90e7cafc gsdx-ogl: support latest fxaa version
Only tested on Nvidia, please report any issue with your driver

Note: requires GL4 GPU
2014-11-10 10:39:55 +01:00
Gregory Hainaut ff39dffe23 gsdx-ogl: add a gui option (linux) to select external shader
Note: of course it requires a glsl shader ;)

On windows, you can set the path on the ini file. Here an example with linux path:
shaderfx_conf = /home/gregory/playstation/emulateur/pcsx2_merge/bin/GSdx_FX_Settings.ini
shaderfx_glsl = /home/gregory/playstation/emulateur/pcsx2_merge/bin/shader.fx
2014-11-10 10:38:52 +01:00
Gregory Hainaut 920ac6695f gsdx-ogl: add preliminary support of external shader fx 2014-11-10 10:37:58 +01:00
Gregory Hainaut e62af05496 gsdx-ogl: reduce complexity of clear texture
Null is equivalent to a clear to 0.

Note: Code is not yet used because both stencil and depth are cleared.
Future note: stencil can potentially be replaced by load_store_image
2014-11-08 21:30:14 +01:00
Tom Burnett 1f734a69a0 Small VS2013 fixes 2014-10-12 01:40:40 -07:00
Gregory Hainaut ccc1137e12 gsdx-ogl: merge the two vertex buffer format
* Only a single VAO
    => Format is set once
    => Only a single bind at startup
    => GSVertexBufferStateOGL is nearly useless
    => barely faster but better than nothing :)
2014-10-02 20:44:22 +02:00
Gregory Hainaut 10c7be8c50 gsdx-ogl: Use 32B strides for all VBO 2014-10-02 20:44:22 +02:00
Gregory Hainaut 79e8a912cd gsdx-ogl: keep the draw buffer enabled by default
Note: Only DATE requires to disable the draw buffer
2014-09-30 22:18:20 +02:00
Gregory Hainaut f46e8cc6ac gsdx-ogl: bump base requirement to 3.3
A couple of fallbacks were introduced for the Mesa driver that only support 3.0
DSA will require a recent Mesa which already support GL3.3

Require at least SandyBridge for Intel GPU
2014-09-30 22:18:20 +02:00
Gregory Hainaut 594f6c33a2 gsdx-ogl-ES: require GL_EXT_shader_io_blocks + GLES3.1
Allow to use same shader interface for all API

Note: on the GL API it will require GL3.3 (see next commit)
2014-09-30 22:18:01 +02:00
Gregory Hainaut 8833afc2fa gsdx-ogl: drop GL_ARB_multi_bind
It will be replaced by DSA so let's reduce the complexity of opengl
2014-09-28 12:23:44 +02:00
Gregory Hainaut 1e86e3cb08 gsdx-ogl: rework callback debug
* use DebugOutputToFile as a callback of gl error. Add a breakpoint to
  find the culprit GL call
* use string instead of char[n]

Note: CheckDebugLog is potentially useless now
2014-09-28 12:00:34 +02:00
Gregory Hainaut 9d8d702aa6 gsdx-ogl: drop GL_NV_depth_clamp extension
superseeded by GL_ARB_clip_control
2014-09-28 12:00:34 +02:00
Gregory Hainaut 4659184cc1 gsdx-ogl: add support of clip_control (depth only)
* replace the [-1;1] depth range of openGL with the DX range [0;1].
2014-09-28 12:00:34 +02:00
Gregory Hainaut cc24da128c gsdx-ogl: fix for gl_clear_texture
Note: Disabled for depth_stencil texture (I'm not sure we can split the two)
2014-09-28 12:00:34 +02:00
Gregory Hainaut 58a8683d7d gsdx-ogl: disable texture compare mode
It seems to be used for depth texture
2014-09-22 09:27:34 +02:00
Gregory Hainaut d51f008c72 gsdx: openglES fix
* require a 3.1 context
* unattach texture of the fbo when they're not used
(avoid to have a texture and depth_stencil with different size)

Note: except minor shader bug it works on Nvidia 340.23.01
2014-09-22 09:27:31 +02:00
Gregory Hainaut 0d45e6d70e gsdx-ogl: avoid to send constant to the GPU
It was a waste of bandwith
2014-04-06 10:44:40 +02:00
Gregory Hainaut b020bd76c6 gsdx-ogl: restore gles build
Add the --gles build option to the linux main script
Ifdef all gl code not supported on gles3 (note some will be reenabled for gles3.1)

Note: it probably doesn't run anymore. My Nvidia driver doesn't support
yet egl/gles so I can't test it. Feel free to contribute.
2014-03-29 11:55:02 +01:00
Gregory Hainaut 8b78551b92 gsdx-ogl: improve debugging capabilities
allow to print memory transfer usage
Check gl call in dev build
2014-03-25 16:36:29 +01:00
Gregory Hainaut 41091f8ebf gsdx ogl: remove multithread hack
This hack was used because GSReadFifo was called from the EE thread.
Previous commit move the call to the GSThread.

Hopefully avoid flushing the full GPU contex would improve openGL
performance (at least avoid some hiccups ;) )

Note: newer GSdx ogl won't be compatible with older PCSX2
2014-03-25 16:36:29 +01:00
gregory.hainaut 384c0c12ea gsdx ogl:
* properly detect gl nv depth extension
* Always show the hack on the gui. Add a new hack option for DATE (gl4.2) only
* Save the scan mode on linux too (f7)
* hopefully fix some crash on some drivers... (ensure aligment 256 bits alignment, and if not use std memcpy)


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5888 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-02-07 19:53:01 +00:00
gregory.hainaut c0558c00e7 gsdx ogl:
* gui refresh
  + Use some tab to reduce heigth for small screen
  + Add logz option
  + remove broken/experimental keyword. GSdx ogl is not too bad ;)
* autodetect GL_NV_depth_buffer_float

Linux tester you are welcome!



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5862 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-02-01 11:11:14 +00:00
gregory.hainaut 84895eadd9 gsdx ogl: correct most of Z-depth issue
Best setting if you driver support GL_NV_depth_buffer_float => GL_NV_Depth = 1 & logz = 0
Otherwise => GL_NV_Depth = 0 & logz = 1

Explanation of the bug:
Dx z position ranges from 0.0f to 1.0f (FS ranges 0.0f to 1.0f)
GL z Position ranges from -1.0f to 1.0f (FS ranges 0.0f to 1.0f)

Why it sucks:
GS small depth value will be "mapped" to -1.0f. In others all small values will be 1.0f! Terrible lost
of accuraccy.

The GL_NV_depth_buffer_float extension allow to set the near plane as -1.0f.
So
"GL z Position ranges from -1.0f to 1.0f (FS ranges 0.0f to 1.0f)"
will become
"GL z Position ranges from -1.0f to 1.0f (FS ranges -1.0f to 1.0f)"
and therefore
"z posision [0.0f;1.0f] will map to FS [0.0f;1.0f]" as DX

Yes we just get back all precision lost previously :)
However you need hardware (intel?) and driver support (free driver?/gles?) :(





git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5860 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-01-31 21:30:54 +00:00
gregory.hainaut c2aa4ff3fd gsdx ogl:
* restore the old fxaa (Asmodeam will be integrated when I got time)
* port the recently added new scanline algo


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5818 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-01-18 14:46:13 +00:00
gregory.hainaut 2238095a82 gsdx ogl: do the same as previous commit but for ogl ;)
* fix the missing auto interlace opt when cycling with hotkey on linux


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5810 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-01-12 11:38:50 +00:00
gregory.hainaut e80b002929 gsdx ogl: Flush various pending work
* try to use more subroutine on VS&PS, unfortunately hit a driver crash!
* Call Attach/DetachContext through GSDevice so I can unmap currently mapped buffer
* Implement glsl part of GL_ARB_bindless texture, again hit another driver crash!
* various fix of GL_ARB_buffer_storage. Basic benchmark show only improvement on 'cold' case, I guess it will improve smoothness
* try to fix GL_clear_texture, no success so far. It seem the extension is limited to basic texture (aka no depth/stencil)



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5752 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-10-24 20:54:27 +00:00
gregory.hainaut e01c6cd9ce gsdx ogl: the proof of concept commit
* GL_ARB_shader_subroutine for perf
fix for nvidia => add missing shader declaration. Nvidia got +4fps on colin3 :) 

For the moment only 2 PS parameters are supported. Code need to be extended to support others games that often
switch shader program (like xenosaga).
require GL4 class hardware and the option override_GL_ARB_shader_subroutine = 1
Note: strangely on AMD linux it is slower!

* GL_ARB_shader_image_load_store for accuraccy (Date)
Use a signed integer texture and reenable color buffer writing

Current status: Amagami_transparency.gs & P3_battle_shadows.gs are now working on Nvidia with a small perf impact.
Current implementation detail:
1/ setup the standard stencil as before
2/ on remaining pixel, draw once to compute first primitive that will write a fail alpha value.
3/ final draw based on primitive id of step 2

Note: I think we would get a bad behavior if depth test&mask are enabled on step 2/3
Note2: on my limited testcase the perf impact was on CPU. It would be possible to merge step1&2 to nullifying it (could 
even be faster actually), however it would require more GPU power.

Again require GL4 class hardware. And the option UserHacks_DateGL4 = 1



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5725 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-28 08:44:16 +00:00
gregory.hainaut 07605941ef gsdx ogl:
* some preliminary work to test/benchmark bindless texture in the future (glsl was not yet updated)
Bindless texture allow to get a GPU texture pointer and then set it directly
to the shader as a basic uniform.
=> no more texture unit selection/validation
=> no more texture validation neither texture hash lookup

3rdparty: update gl header to the latest gl4.4


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5720 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-17 08:57:52 +00:00
gregory.hainaut b4084047be gsdx ogl: Used a basic flat interpolation for color interpolation (line & tri primitives)
Card that support gs:
remain only a gs to generate sprite from a line. Even dummy gs are costly for the GPU.

Card that don't support gs:
remove useless copy of color for line and triangle primitives

Note for dx: opengl 3.2 (maybe not gles) supports both flat interpolation
convention (GL_FIRST_VERTEX_CONVENTION or GL_LAST_VERTEX_CONVENTION).  It might
be possible to shuffle vertex index to put the last vertex in first position.

- buff[0] = head + 0;
- buff[1] = head + 1;
- buff[2] = head + 2;
+ buff[0] = head + 2;
+ buff[1] = head + 1;
+ buff[2] = head + 0;



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5718 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-14 10:18:38 +00:00
gregory.hainaut 0f603a98d5 gsdx ogl: Test the ARB_shader_subroutine GL4.0 extension
The idea was to replace shader program swith by pointer function calls inside
shaders.  At least parameters that are often changed between draw call. So far
I only ported atst and colclip. Unfortunately code is "slower" (on GSdx standalone).
For the moment keep the code but disabled.

If I understand well the validation of program is done in the "driver thread"
but the additional call are done in the overloaded MTGS thread. Apitrace
profiling shows faster GPU draw calls. Another possibility is that the driver still
need to validate the draw call because of others state change.

Here some stats on colin3 (90 frames):
without subroutine: UseProgram 125246
with subroutine: UseProgram 2906, subroutine 125945 => 3605 extra calls overhead (not
all parameters are ported to subroutine)



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5715 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-10 19:43:59 +00:00
gregory.hainaut a46b489a24 gsdx ogl: various minor optimization.
* move most of gl states into a separate namespace. Extend it to depth/stencil/blend micro state
=> save 10,000 opengl call by frame for colin mcrae 3
* Only setup blend state of first drawbuffer
* Don't request anymore a debug context on dev/release build



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5713 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-05 20:25:25 +00:00
gregory.hainaut 34045eb8f7 gsdx ogl: AMD users upgrade to 13.8 now ;)
* clean extension management and fix compilation of previous gl44 code.
* Use pixel buffer object to upload texture data.
=> avoid crash on AMD driver
=> a bit faster and probably got some margins for the future



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5712 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-03 08:29:01 +00:00
gregory.hainaut 64f783410e gsdx ogl:
* preliminary work for GL4.4 extensions (ARB_clear_texture & ARB_multi_bind). Disabled until I got a 4.4 driver
Note: I plan also to use ARB_buffer_storage
* compute texture gl option in the constructor (avoid a couple of swith case)
* redo texture unit management. Unit 0-2 for shaders, Unit 3 for texture operations. MultiBind will allow to bind 
shader input without disturbing texture binding points.


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5711 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-02 16:38:12 +00:00
gregory.hainaut 2634ee851c miss another debug stuff
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5709 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-07-28 19:08:46 +00:00
gregory.hainaut e394de86dc gsdx ogl:
* add a non-working hack: UserHacks_DateGL4, goal was to replace UserHacks_AlphaStencil
 + Detection of good/bad samples is based on primitive ID variable. However I'm not sure
 the behavior is always the same between draw call...Anyway let's keep a copy of the current
 work
* Dump integer texture into text csv
* add gl4.2 ARB_shader_image_load_store extension (needed by UserHacks_DateGL4)


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5707 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-07-28 14:40:43 +00:00
gregory.hainaut d3e1e4da0d gsdx ogl:
* allow to switch renderer with F9
* skip first frame in stat of the replayer
* drop msaa. Fxaa and internal resolution will do the job
* move texture attachment from texture object into device object (allow to keep sanely the state)
* split the write buffer and attachment setup
* completely split sampler and texture input setup
* redo GSDeviceOGL::CopyOffscreen to avoid an extra copy.



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5704 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-07-19 19:25:50 +00:00
gregory.hainaut 916a091f8a gsdx ogl: GLes
* use gles header file, disable opengl code (mainly GLX, ARB_sso, geometry shader)
* Define properly the function pointer, GLES use basic linking whereas GL must get the symbol dynamically
* cmake: properly search and set libglesv2.so
* don't use dual source blending => HW renderer work (only miss unimportant FBA)



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5701 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-07-13 11:39:45 +00:00
gregory.hainaut bd5f379044 gsdx ogl:
* uniform was wrongly set before the activation of the program (free driver only)
* Always use only 1 drawbuffer. Easier besides previous setup was wrong for convert:ps_main1

GLES trial (v3):
* add the cmake option GLES_API. Note library (libgles) are hardcoded for the moment
* Disable opengl check
* Disable gl_GetDebugMessageLogARB not supported!
* Emulate gl_DrawElementsBaseVertex, add manually the index offset (surely slow but work)
* Fix hundred of shader error (no implicit cast of integer to float...)
Unfortunately GLES doesn't support dual blending so no blend in hardware renderer. Otherwise it is fine




git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5700 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-07-12 21:12:34 +00:00
gregory.hainaut f22b366cea gsdx:
* redo glsl2h.pl script to generate only one big glsl headers.
* fix gcc warning in GSVector.h
* fix memory leak of GSDeviceOGL.m_shader
* clean shader compilation function => split generation header & drop malloc stuff
cmake:
* only rebuild shader when asked by the use. Avoid perl dependency to build pcsx2


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5698 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-07-07 16:13:11 +00:00
gregory.hainaut dc036ff59d gsdx: split shader/program/pipeline object management into a separate class
* remove the possibility to compile shader from file. Some people loads older shaders...
  The feature might be readded later


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5697 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-07-06 10:08:52 +00:00
gregory.hainaut 544c84a344 gsdx:
* try to setup advance gl context attribute. If driver doesn't support it fallback to the default.

zzogl:
* use glsl2h to generate an header shader as GSdx. Much easier to install
* Get GSUniformBufferOGL & GSVertexArrayOGL speed improvement from GSdx

cmake:
* detect current gcc version. Yield  a warning if < 4.7 or an error if < 4.5

glsl2h:
* support zzogl too
* compute md5sum to avoid useless relink


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5696 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-07-06 09:42:46 +00:00
gregory.hainaut ba2ec6fe59 all: gcc warning clean round 4
* remove unused variable
* move static function from h into cpp
* Initialized hw_by_page to 0xFFFFFFFF instead of -1 (number must be a positive integer)
* Use a s32 fore m_current_lsn instead of u32 (use -1 as error code)
Bonus: a couple of fix for clang compiler (doesn't mean that it fully compile with clang)
* remove useless __debugbreak on linux
* use short for 16bits atomic function


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5695 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-07-03 18:42:05 +00:00
gregory.hainaut d81254469a gsdx ogl:
* fix shader compilation on Nvidia (broken on r5682)
* fix various memory leak thanks to Valgrind

Cmake: fix a minor typo


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5688 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-06-30 11:18:46 +00:00
gregory.hainaut 58cacc3b1c gsdx: use size_t for loop index when it used countof macro
* fix override_GL_ARB_copy_image typo


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5687 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-06-29 12:02:03 +00:00
gregory.hainaut ca1edbf2cb gsdx ogl:
* Separate state and shader compilation into separate function
* replace various hash_map by basic array
* Compact VertexScale and offset into a single vec4
* add the new option "ogl_vertex_subdata": subdata is faster on FGLRX, test are welcome on Nvidia drivers
    0 => use map/unmap
    1 => use subdata

replay: add "linux_replay" option and compute some nice stat (mean, standard deviation)

cmake: recreate shader header at build time


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5682 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-06-26 20:09:07 +00:00
gregory.hainaut f4ce6d6fce gsdx:
* don't delete the wnd in GSclose. It can still be used later
* Properly detect the GL_ARB_gpu_shader5 extension for Fxaa
* A couple of fix in Create (GSopen1) of GSWndEGL/GSWndOGL


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5674 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-06-16 16:33:08 +00:00
gregory.hainaut 9b037a3813 pcsx2: increase a bit the minimum size of the plugin dialogs (GSdx name is too long)
gsdx:
* move gl function loading into GSwnd. Clean the header and avoid to rely on macro.
* Always require libegl for GSdx. You still need to select the API at compilation.


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5670 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-06-16 08:43:50 +00:00
gregory.hainaut b10d554764 gsdx (ogl):
* fix memory leak of m_wnd
* don't escape % in the shader string
* Fix shadeboost StretchRect parameter

debian: compress package with xz


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5669 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-06-16 07:05:57 +00:00
gregory.hainaut 4c37d25134 gsdx-ogl-wnd: fix ifdef mess on fxaa (both dx and ogl)
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5665 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-06-15 10:35:10 +00:00
gregory.hainaut 59501227b3 gsdx-ogl-wnd: implement fxaa. I directly reuse the .fx file with minor update
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5659 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-06-14 21:22:44 +00:00
gregory.hainaut e021630bb9 gsdx-ogl-wnd:
* Don't write color during stencil. Keep the old method to ease debug
* Realign date shader on DX
* keep stencil ref to 1. Reduce stencil management burder
* Fix texture pitch (was in pixels but need bytes). Fix Bleach Blade Battlers 2. Thanks Miseru for your trace

Remember note:
fxaa -> not yet implemented
msaa -> not implemented (I might drop it actually)
8 bit texture -> not yet implemented



git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5657 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-06-14 11:34:44 +00:00
gregory.hainaut 041bf4cee6 gsdx-ogl-wnd:
* detect Advanced Micro Devices for newer AMD Card...
* Mess with coordinate for StretchRect. Upscaling seem to work now. Surely the biggest diff between ogl and dx...



git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5655 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-06-11 11:14:26 +00:00
gregory.hainaut 58699923e0 gsdx-ogl-wnd:
* factorize sample object creation
* remember frame buffer attachment state
* Use a basic context on EGL. Allow to use Mesa 9.1 on AMD GPU.
* precompile vertex and geometry shader to avoid benchmark polution on replay
* Try harder to detect FGLRX driver on window
* various clean

Remain to fix the coordinate system for upscaling



git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5651 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-06-09 16:41:58 +00:00
gregory.hainaut efcb015361 gsdx-ogl-wnd:
* clean texture management & use different texture unit for various texture operation
* separate sampler and texture setup
* Always disable GL_SCISSOR_TEST before any cleaning to be sure the all pixels are reset
* properly set upscale_multiplier in the linux gui (was mixed with msaa). Unfortunately it is still broken...
* Fix EGL with GSopen1
* Use stdcall for function definition in the replay (avoid segmentation fault)
* Directly use gl_Position in shader instead of an extra user defined variable


git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5648 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-06-01 09:29:57 +00:00
gregory.hainaut acaf1ac8d5 gsdx-ogl-wnd: Make the HW renderer "work" on the opensource driver
* replace most OGL_FREE_DRIVER with a dynamic detection. Remains the context creation. Closed drivers need 3.3
* Add the CopySubImage fallback


git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5647 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-27 18:02:27 +00:00
gregory.hainaut 75418aba43 gsdx-ogl-wnd:
* Emulate Geometry Shader from the CPU.
* add some option to override opengl extension detection
* redo shader interface (again) to compile on the free driver

SW renderer is now working on the free driver.

To test it on your linux box use this cmake option -DEGL_API=TRUE
Note: (need opengl 3.0) I test mesa 9.2 git



git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5646 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-27 16:53:38 +00:00
gregory.hainaut aafa7a088a gsdx-ogl-wnd:
* fix a bad interaction when GL_ARB_SSO is supported without GL_ARB_shading_language_420pack
* try to replace struct with flat parameter in glsl interface
=> with some hacks of the free driver, I was able to compile SW renderer shader. Unfortunately
   the rendering is broken. Maybe my hack :p
* remove EGL context check (wasn't working as expected)


git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5644 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-26 13:05:03 +00:00
gregory.hainaut c9b3dcf581 gsdx-ogl-wnd:
* fix wrong interaction when both GL41 or GL42 aren't supported
* Add a new define to downgrade opengl requirement to test the free driver and EGL
 => As far as I can tell, they only miss geometry shader& GLSL150 (got GLSL140)
* Found a way to avoid crash in AMD driver. Skip the upload of small texture but rendering is wrong now...


git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5643 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-25 14:28:16 +00:00
gregory.hainaut 49cd9fd97d gsdx-ogl-wnd:
* remove an old&useles dummy geometry shader (was used to workaround amd bug)
* use a basic X11 window for GSopen1
* redo the replayer: it is now based on dlopen rather than standard static/dynamic library. AMD driver doesn't play nicely with the later...


git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5639 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-23 17:03:18 +00:00
gregory.hainaut 5a0367f398 gsdx-ogl-wnd: fix compilation on MSVC201x
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5638 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-20 16:29:56 +00:00
gregory.hainaut 6779edfe28 gsdx-ogl-wnd:
* replace both DISABLE_GL41_SSO and DISABLE_GL42 macro with a dynamic check based on the extension support
* glsl2h: use static instead of extern


git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5636 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-19 12:42:55 +00:00
gregory.hainaut 9376b26537 gsdx-ogl-wnd:
* load extension as gl_Function instead of glFunction


git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5635 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-19 09:19:20 +00:00
gregory.hainaut b8e36b44af gsdx-ogl-wnd: sync from trunk (r5550 to r5632)
Note: Code need to be fixed with latest GSdx ogl additions


git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5633 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-18 16:09:09 +00:00
gregory.hainaut e3d658b501 gsdx ogl: load shader from C code instead of an external file
* add a perl script to convert shader to char*
* By default use *.glsl file (handy to do some trials). 

Only drawback, glsl2h need to be manually called at the moment



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5632 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-18 09:17:30 +00:00
gregory.hainaut 6760ed552b gsdx ogl: fix previous commit
* clean shader properly
* use a 64 bits map to contain shader pointer


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5631 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-17 20:14:10 +00:00
gregory.hainaut a163677cec gsdx ogl: New config option to disable separate shader program. Not enabled by default
* workaround AMD driver bleding issue (got at least a nice rendering in GoW)
* would run on the opensource driver when they support geometry shader

Stil got some crash on the driver. Arg!!!
FGLRX user if you want to do some trials 
uncomment #define DISABLE_GL41_SSO in GSdx/config.h



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5630 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-17 18:28:51 +00:00
gregory.hainaut 15b255617a gsdx ogl: use immutable texture, it would avoid multiple validation from the driver
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5629 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-17 16:55:55 +00:00
gregory.hainaut 320d3a572e gsdx ogl: AMD fix some bugs in their driver ! (catalyst 13.4 branch). Let's remove some buggy workaround now.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5627 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-05-08 15:51:42 +00:00