Gregory Hainaut
15ae9996bb
glsl: format white space + comment
2015-04-25 12:50:12 +02:00
Gregory Hainaut
757726bb91
gsdx-ogl: allow to invalidate the texture
...
It is just a hint to the driver to avoid any useless transfer
I don't expect any change but it is free so why not ;)
2015-04-25 12:50:12 +02:00
Gregory Hainaut
75817bb27b
gsdx-ogl: add a quick and dirty DSA layer emulation
...
The global idea is to use
1/ bind in tight loop
2/ DSA otherwise (to avoid any binding in tight loop)
2015-04-25 12:50:11 +02:00
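A minimal sketch of what a DSA emulation layer can look like. It uses a stand-in for the GL state machine so it runs without a GL context. `FakeGL` and all of its members are hypothetical names, not taken from the commit. The assumption is that the emulated entry point takes the object name directly and hides the bind inside, which is why it is best kept out of tight loops.

```cpp
#include <map>

// Stand-in for the GL state machine; every name here is hypothetical.
struct FakeGL {
    unsigned bound = 0;                 // currently bound texture
    int bind_calls = 0;                 // number of binds issued so far
    std::map<unsigned, int> min_filter; // one per-texture parameter

    // Classic bind-to-edit entry points: they act on the bound object.
    void BindTexture(unsigned tex) { bound = tex; ++bind_calls; }
    void TexParameter(int value) { min_filter[bound] = value; }

    // Emulated DSA-style entry point: the caller names the object, and
    // the layer pays for one hidden bind per call.
    void EmulatedTextureParameter(unsigned tex, int value) {
        BindTexture(tex);
        TexParameter(value);
    }
};
```

Note that this sketch leaves the binding changed after the call; whether the real layer restores it is not stated in the commit.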
Gregory Hainaut
eb257d9295
gsdx-ogl: add DSA function placeholder
2015-04-25 12:50:11 +02:00
Gregory Hainaut
baf84b98c4
gsdx-ogl: add override_GL_ARB_texture_barrier option
...
To ease regression test.
2015-04-25 09:59:25 +02:00
Gregory Hainaut
b12eb45bb7
gsdx-ogl: try to avoid crash on fglrx windows
2015-04-25 09:50:19 +02:00
Gregory Hainaut
f0182f9a66
windows requires APIENTRY (=> stdcall)
2015-04-25 01:32:09 +02:00
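A small sketch of how a callback is typically declared with APIENTRY so that it uses stdcall on Windows and the default convention elsewhere. `LogCallback`, `append_message` and `notify` are hypothetical names; the commit does not say which function needed the fix.

```cpp
#include <string>

// On Windows the platform headers define APIENTRY (stdcall); elsewhere
// it is usually absent, so fall back to an empty macro.
#ifndef APIENTRY
#define APIENTRY
#endif

// Hypothetical callback type: the calling convention is part of the
// function pointer type, so both sides must agree on it.
typedef void (APIENTRY *LogCallback)(const char* message, void* user);

// A callback matching that type; it appends the message to a std::string.
void APIENTRY append_message(const char* message, void* user) {
    static_cast<std::string*>(user)->append(message);
}

// Stand-in for the library side that invokes the callback.
void notify(LogCallback cb, void* user) {
    cb("hello", user);
}
```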
Gregory Hainaut
7b0775d887
gsdx-ogl: add some fences to protect the upload of vbo buffer
...
This way ogl_vertex_storage should be safer to enable
And it brings a nice performance boost (game with lots of primitives and
reasonable upscaling)
SotC testcase 4x: 61fps => 78fps
2015-04-24 23:15:19 +02:00
Gregory Hainaut
36514bd95f
glsl: fog is a single byte
...
Give a chance to the driver to optimize if possible
2015-04-24 21:37:37 +02:00
Gregory Hainaut
c207632e49
gsdx-ogl: improve DATE performance for GL45
...
If there is no overlap, it is allowed to directly read from the render target.
On SotC testcase with 6x scaling: 30fps -> 40fps
Note: it requires GL_ARB_texture_barrier extension so be sure to have a recent driver
Note2: it requires a lot of testing too
Open question: in case of complex DATE (written alpha),
would it be faster to split the draw call into multiple calls with no
primitive overlap?
2015-04-24 21:12:33 +02:00
Gregory Hainaut
795ae50ecd
gsdx-ogl: fix the recently broken advanced DATE feature
...
Now it really works with a 2-stage shader, but it is still slow.
2015-04-24 20:13:38 +02:00
Gregory Hainaut
672e3f9533
gsdx-ogl: use DSA for texture management
...
Yeah code is much nicer :)
2015-04-24 19:34:17 +02:00
Gregory Hainaut
6d31d1e0d0
gsdx-ogl: add a layer to emulate DSA for texture
...
Framebuffer function will be nice too
2015-04-24 19:32:00 +02:00
Gregory Hainaut
f71eb171cf
gsdx-ogl: add glTextureBarrier function pointer
...
Could be useful
2015-04-24 18:35:01 +02:00
Gregory Hainaut
6e386df535
gsdx-ogl: avoid fully clearing the texture in DATE
...
It is useless and it has a small impact on performance for big upscaling
2015-04-24 18:32:08 +02:00
Gregory Hainaut
03e72781aa
gsdx-ogl: drop support of GL_ARB_clear_texture extension
...
Extension is a bit slower.
We use it to clear the RT but we generally use it right away so
we don't avoid the FB attachment.
2015-04-24 18:15:58 +02:00
Gregory Hainaut
89d5e5637c
glsl: use an explicit cast instead of notEqual function
...
If the compiler didn't optimize the code, it will be a bit faster
2015-04-24 18:01:25 +02:00
Gregory Hainaut
56836561f4
glsl: replace runtime condition by preprocessor condition
...
It might make the compiler's work easier
I didn't replace all occurrences to keep readability
2015-04-24 17:51:29 +02:00
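A minimal C++ sketch of the idea, assuming the shader source is assembled on the host before compilation. Each option becomes a `#define` right after the `#version` line, so the shader body can test it with `#if` instead of branching on a uniform at runtime. `build_shader_source` is a hypothetical helper, and the macro names passed to it are up to the caller.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: assemble GLSL source with compile-time constants.
// Each (name, value) pair is emitted as "#define name value" between the
// version line and the shader body.
std::string build_shader_source(const std::string& version_line,
                                const std::vector<std::pair<std::string, int>>& defines,
                                const std::string& body)
{
    std::string src = version_line + "\n";
    for (const auto& d : defines)
        src += "#define " + d.first + " " + std::to_string(d.second) + "\n";
    return src + body;
}
```

The trade-off is one compiled shader per combination of values, which is presumably why only some conditions were converted.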
Gregory Hainaut
4bb8d15228
gsdx: be more verbose on title and bandwidth debug
2015-04-24 17:13:56 +02:00
Gregory Hainaut
258b73409c
gsdx-ogl: update flags for buffer storage
...
Fix issue with Mesa driver
2015-04-23 21:10:43 +02:00
Gregory Hainaut
f6652e9a50
gsdx-ogl: disable slow and buggy code until I find a better solution
...
ogl_texture_storage 1 creates texture corruption.
Advanced DATE is too slow; the code needs to be updated (properly) to use only 2 passes, not 3
Maybe one could be enough (sometimes)
2015-04-22 09:33:41 +02:00
Gregory Hainaut
b32f808fd4
gsdx-ogl: increase the number of pbo
...
It would cost 16MB of extra storage on the GPU but it might
reduce conflict of texture upload.
2015-04-22 00:40:38 +02:00
Gregory Hainaut
bd6ea17bdc
gsdx-ogl: speed improvement for DATE
...
DATE is implemented in 2 ways.
1/ with stencil
2/ purely in FS (sw)
I kept method 1 to reduce the work on method 2. It sucks for performance.
So it would be either 1 or 2.
Note: DATE has a big impact on higher upscaling
Note2: you can disable the 2nd method with this configuration parameter
override_GL_ARB_shader_image_load_store = 0
2015-04-22 00:36:34 +02:00
Gregory Hainaut
15dcf07b3b
revert previous commit
...
Not better; worse, slower.
I'm afraid I will need proper fencing
2015-04-22 00:32:46 +02:00
Gregory Hainaut
8386b427ea
gsdx ogl: restore GL_MAP_COHERENT_BIT for texture upload
...
I hope to fix the texture upload corruption
I hope that impact on perf will remain small
2015-04-21 23:34:26 +02:00
Gregory Hainaut
19eb1f00d1
gsdx ogl: flush vbo range instead of barrier
...
For testing purpose. I don't know which one is better.
It seems flushing has less fps fluctuation than barrier.
2015-04-21 21:44:50 +02:00
Gregory Hainaut
ce98276322
gsdx-ogl: improve speed of vertex streaming
...
Not yet enabled because I'm afraid of data corruption, but feel free to test it
The option:
ogl_vertex_storage = 1
Performance note (warm cache+gs replay on colin3)
60 fps -> 76 fps
2015-04-20 09:38:03 +02:00
Gregory Hainaut
62489f42f1
gsdx-ogl: add an optimization note for later
...
Only 1 byte of fog is useful
2015-04-20 07:18:09 +02:00
Gregory Hainaut
6d253c0b8f
glsl: fix debugging of tex coordinate in apitrace
2015-04-20 07:18:08 +02:00
Gregory Hainaut
31f8c065db
gsdx-ogl: implement a new hack UserHacks_UnscaleSprite for opengl
...
UserHacks_UnscaleSprite = 1 will unscale flat sprites
UserHacks_UnscaleSprite = 2 will unscale all sprites (doesn't work well so far)
The idea of the hack is to redo the interpolation of texture coordinate
based on the non-upscaled pixel position.
It avoids various glitches but sprites aren't upscaled anymore (so no
more anti-aliasing, potentially a coefficient can be added).
2015-04-20 07:18:08 +02:00
Gregory Hainaut
6124eb844e
gsdx-ogl: only compile useful VS
...
logz is a constant
wildhack is only compatible with TME/FST
Compilation goes down from 64 to 20 vertex shaders.
2015-04-20 07:17:58 +02:00
Gregory Hainaut
16e6d0d305
glsl: move shader into a separate directory
...
Only keep glsl_source.h for clarity
2015-04-19 18:49:02 +02:00
Gregory Hainaut
55fdf26898
glsl: remove the older file tfx.glsl
2015-04-19 18:49:02 +02:00
Gregory Hainaut
15264c6c63
glsl: split the main shader
...
* separate VS/GS and FS
* separate subroutine part of the FS
It is already complex enough without the subroutine stuff. Besides, I'm not sure
we will keep subroutines in the future.
2015-04-19 18:49:02 +02:00
Gregory Hainaut
6fc9afb175
Merge pull request #507 from PCSX2/stdcall-for-plugin
...
Stdcall for plugin
2015-04-19 18:48:32 +02:00
Gregory Hainaut
1d70865f09
Merge branch 'gsdx-boost-queue'
2015-04-17 19:13:32 +02:00
Gregory Hainaut
e605ed1d09
gsdx-queue: add a comment for the future
2015-04-17 19:12:36 +02:00
Gregory Hainaut
fa243afbab
gsdx SW: enable new queue && C++11 on linux/MSVC 2012+
2015-04-17 19:12:36 +02:00
Gregory Hainaut
d91e989abb
gsdx-queue: pass shared_ptr by reference
...
It avoids atomic +1/-1 of the reference counter
The counter is still incremented when the ptr is copied into the queue
2015-04-17 19:12:36 +02:00
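A small sketch of the effect described above, using `use_count()` to make the reference counter visible. `Job` and the function names are hypothetical and only serve the illustration.

```cpp
#include <memory>
#include <queue>

// Hypothetical job type, for illustration only.
struct Job { int id; };

// By value: the parameter is a copy, so the counter is atomically
// incremented on entry and decremented on exit.
long use_count_by_value(std::shared_ptr<Job> job) {
    return job.use_count();
}

// By const reference: no copy is made, so the call itself does not
// touch the counter.
long use_count_by_ref(const std::shared_ptr<Job>& job) {
    return job.use_count();
}

// The queue stores its own copy, so the counter still goes up once here.
void enqueue(std::queue<std::shared_ptr<Job>>& q,
             const std::shared_ptr<Job>& job) {
    q.push(job);
}
```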
Gregory Hainaut
84b33d2ddb
gsdx-queue: plug the new queue as a drop-in replacement of the previous boost queue
2015-04-17 19:12:36 +02:00
Gregory Hainaut
90794c302a
gsdx-queue: import spsc_queue of boost
...
I removed 80% of the file to only keep the ring buffer core
Same speed as boost but without the boost dependency
2015-04-17 19:12:36 +02:00
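For readers unfamiliar with the structure, here is a minimal single-producer/single-consumer ring buffer sketch. It is not the imported boost code; `SpscRing` and its members are hypothetical names, and the design assumes exactly one producer thread and one consumer thread.

```cpp
#include <atomic>
#include <cstddef>

// Minimal SPSC ring buffer (illustrative sketch, not the boost code).
// One slot is always left empty to tell "full" apart from "empty",
// so the usable capacity is N - 1.
template <typename T, std::size_t N>
class SpscRing {
    static_assert(N >= 2, "need at least one usable slot");
public:
    // Producer thread only.
    bool push(const T& value) {
        const std::size_t w = m_write.load(std::memory_order_relaxed);
        const std::size_t next = (w + 1) % N;
        if (next == m_read.load(std::memory_order_acquire))
            return false; // full
        m_data[w] = value;
        // Publish the slot only after the data is written.
        m_write.store(next, std::memory_order_release);
        return true;
    }

    // Consumer thread only.
    bool pop(T& out) {
        const std::size_t r = m_read.load(std::memory_order_relaxed);
        if (r == m_write.load(std::memory_order_acquire))
            return false; // empty
        out = m_data[r];
        // Release the slot only after the data is read.
        m_read.store((r + 1) % N, std::memory_order_release);
        return true;
    }

private:
    T m_data[N];
    std::atomic<std::size_t> m_write{0};
    std::atomic<std::size_t> m_read{0};
};
```

Each index is written by one side only, which is what lets the queue work without a lock.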
Gregory Hainaut
c9194301a0
gsdx-queue: (linux) add a GUI option to select the queue
2015-04-17 19:12:33 +02:00
Gregory Hainaut
0aac47ca59
gsdx-queue: add a new option "spin_thread" to select the queue behavior at runtime
...
If someone has a more elegant solution, feel free to share it
spin_thread = 0
spin_thread = 1 // faster, but the GS thread will never stop; very bad for laptops
2015-04-17 19:03:21 +02:00
Gregory Hainaut
9682061472
gsdx-queue: add a new job dispatcher queue based on boost and C++11
...
It is faster on linux, it requires less code, and it is "portable"
It requires boost (only hpp files) + MSVC 2013 (for atomic) (seems doable with 2012 too)
Actually there are several queues that either use spinlock or full sleep
2015-04-17 19:03:21 +02:00
Gregory Hainaut
a75d78bd7e
gsdx: use standard lock_guard instead of GSAutoLock
2015-04-17 19:03:21 +02:00
Gregory Hainaut
9ad5933120
gsdx: Use composition instead of inheritance to support lock
...
To ease update to C++11
2015-04-17 19:03:21 +02:00
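A short sketch of the pattern behind the two locking commits above: the class owns a `std::mutex` member (composition) and each method takes it through `std::lock_guard`. `JobList` is a hypothetical class, not one from GSdx.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical container that owns its lock as a member instead of
// deriving from a lock class.
class JobList {
public:
    void add(int job) {
        // The guard locks here and unlocks when it goes out of scope,
        // including on early return or exception.
        std::lock_guard<std::mutex> guard(m_lock);
        m_jobs.push_back(job);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(m_lock);
        return m_jobs.size();
    }

private:
    mutable std::mutex m_lock; // mutable so const methods can lock it
    std::vector<int> m_jobs;
};
```

With the lock as a member, swapping a custom lock type for `std::mutex` only touches the member declaration and the guard type.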
Gregory Hainaut
8deee6afbc
gsdx: include some C++11 define for later
2015-04-17 19:03:21 +02:00
Gregory Hainaut
9ce7f515bc
cdvdiso: add stdcall convention
2015-04-17 18:34:05 +02:00
Gregory Hainaut
1cb047687f
common: use stdcall convention too
...
(Likely used by other null plugins)
2015-04-17 18:33:26 +02:00
Gregory Hainaut
5c8ea74cb9
null plugins: add stdcall convention
2015-04-17 18:33:10 +02:00