Commit Graph

97 Commits

Author SHA1 Message Date
Gregory Hainaut 8393ba56d6 gsdx-ogl: rework palette texture handling
Redirect the red channel to alpha channel for 8 bits texture.

It avoid special management in the shader
2015-06-24 19:50:09 +02:00
Gregory Hainaut d3d5a436ea gsdx-ogl: add code to read back depth texture 2015-05-20 08:07:40 +02:00
Gregory Hainaut 2783da4a22 gsdx-ogl: use a local buffer to store offscreen texture
It will allow to read texture in // (and potentially could be useful
for recording)
2015-05-18 11:29:04 +02:00
Gregory Hainaut b1ea081fc3 gsdx-debug: improve tracing interface
Basically move the format and c_str() in the macro
2015-05-17 13:05:08 +02:00
Gregory Hainaut 8a73849531 gsdx-ogl: enable multithread driver by default for nvidia
+ add a linux gui option to disable it (for test purpose)
2015-05-16 15:22:20 +02:00
Gregory Hainaut 02b478dfbc gsdx: plug the new PNG wrapper
Drop various duplicated code :)
2015-05-16 12:47:28 +02:00
Gregory Hainaut 8341055f3e gsdx-ogl: don't enable ogl_texture_storage on catalyst 2015-05-16 00:31:25 +02:00
Gregory Hainaut cfddcb7a93 gsdx-ogl: typo 2015-05-16 00:31:25 +02:00
Gregory Hainaut 0f01ba4c46 gsdx-ogl: mega boost
Enable Nvidia multi thread driver optimization
Enable ogl_texture_storage by default (requires for the speed boost
, later the option will be removed)
2015-05-16 00:31:25 +02:00
Gregory Hainaut 6166c95325 gsdx-ogl: protect PBO with fence
Safer and doesn't impact perf too much.
2015-05-15 18:32:47 +02:00
Gregory Hainaut a5e424512c gsdx-ogl: really avoid consecutive clean 2015-05-15 16:00:46 +02:00
Gregory Hainaut 613e215c73 gsdx-ogl: add some note for the persistent buffer + add a flush
Persistent is slower (at least on my gs dump) because data is put
in  host instead of the video memory

I don't understand why upload the data directly to the video memory
is faster
2015-05-15 15:25:45 +02:00
Gregory Hainaut 5628bfb20c gsdx-ogl: drop old code
I have group so it doesn't pollute anymore gl trace
2015-05-15 15:25:45 +02:00
Gregory Hainaut 3e784d57e8 gsdx-ogl: add some flags to trace texture state
goal1: avoid 2 consecutives clean of the render target
goal2: only invalidate texture correctly
2015-05-12 18:03:06 +02:00
Gregory Hainaut f37f3cb3cf gsdx-ogl: improve texture uploading
Initially we copy pitch by line in the PBO and tell the dma to only
use the first valid byte.

Now, we only copy useful data to the PBO. It reduce the copy and PBO memory requirement.

It seems a bit faster on native resolution
2015-05-11 16:32:13 +02:00
Gregory Hainaut 4e2e9aa56c gsdx-ogl: always read the first attachment of the fbo 2015-05-11 16:28:34 +02:00
Gregory Hainaut 1523b9534f gsdx-debug: compact the code 2015-05-11 11:19:00 +02:00
Gregory Hainaut cc4713d379 gsdx-debug: extend ogl debug capabilities
Group opengl calls into a nice name.

Apitrace shows them in a tree format that support folding. Previously it
was a long flat list (10K-40K of lines by frame)

I align the call number with the internal s_n variable. This way it is
easy to map GSdx dump output with the GL debugger :)
2015-05-06 19:09:13 +02:00
Gregory Hainaut 73d04e33e9 gsdx ogl: clean various comment and old code 2015-05-01 20:04:23 +02:00
Gregory Hainaut 335695bd0e purge GLES from GSdx !
mobile will use vulkan (or any new API) anyway
2015-05-01 20:02:17 +02:00
Gregory Hainaut 004fa7aea4 gsdx debug: allow to dump alpha channel as a gray texture
I would love to find an image viewer that allow to mask channel of the image
2015-05-01 13:38:58 +02:00
Gregory Hainaut c76e66f8d2 gsdx-ogl: fix read back of render target
Initial code use a PBO to do asynchronous transfer. It is silly because
GSdx doesn't use this free time. So let's use a sync read. Same speed but
no PBO to manage.
2015-05-01 01:26:44 +02:00
Gregory Hainaut 25997647f2 gsdx-ogl: add ENABLE_OGL_PNG_OPAQUE to dump texture without alpha
Alpha is nice but fully transparent texture suck

The best will be an image viewer that can toggle the alpha channel
2015-04-30 23:06:54 +02:00
Gregory Hainaut 8a52fdab57 gsdx-ogl: allow to dump texture as png file
-- slower (but that a debug feature)
++ smaller (40x-50x)
++ native support of alpha

Require libpng++ and the define ENABLE_OGL_PNG

Note: depth is not supported yet.
2015-04-30 20:02:50 +02:00
Gregory Hainaut ee19a2789c gsdx: move invalidation from GSDevice to GSTexture
Much cleaner this way
2015-04-30 19:55:57 +02:00
Gregory Hainaut ee244071fa gsdx-ogl: use 64 bits counter + fix division factor
I also added a counter of the real size of the texture.

I have a bad overhead for pbo transfer
2015-04-25 14:18:21 +02:00
Gregory Hainaut 00e62919c5 gsdx-ogl: use countof macro instead to hardcode the size 2015-04-25 13:06:02 +02:00
Gregory Hainaut 672e3f9533 gsdx-ogl: use DSA for texture management
Yeah code is much nicer :)
2015-04-24 19:34:17 +02:00
Gregory Hainaut 03e72781aa gsdx-ogl: drop support of GL_ARB_clear_texture extension
Extension is a bit slower.

We use it to clear the RT but we generally use it right away so
we don't avoid the FB attachment.
2015-04-24 18:15:58 +02:00
Gregory Hainaut 258b73409c gsdx-ogl: update flags for buffer storage
Fix issue with Mesa driver
2015-04-23 21:10:43 +02:00
Gregory Hainaut f6652e9a50 gsdx-ogl: disable slow and buggy code until I found a better solution
ogl_texture_storage 1 creates texture corruption.

Advance date is too slow, code need to be updated (properly) to uses 2 passes only not 3
Maybe one could be enough (sometimes)
2015-04-22 09:33:41 +02:00
Gregory Hainaut 15dcf07b3b revert previous commit
Not better, worst slower.

I'm afraid I will need proper fencing
2015-04-22 00:32:46 +02:00
Gregory Hainaut 8386b427ea gsdx ogl: restore GL_MAP_COHERENT_BIT for texture upload
I hope to fix the texture upload corruption
I hope that impact on perf will remain small
2015-04-21 23:34:26 +02:00
Gregory Hainaut 330d14941f gsdx-linux: support dump mode on linux
It could be useful to analyze GS dump. Warning it consumes a lot of
disk space.
2015-02-21 13:51:06 +01:00
Gregory Hainaut 276e3d9d1b gsdx-ogl: always set texture parameter
* Avoid bug after a pause
* Not faster anyway
* keep old method only for gl retracer to reduce debugging noise

Remove some old&useless comment
2014-11-14 11:43:42 +01:00
Gregory Hainaut 16377f7249 gsdx-ogl: only call PixelStorei when parameters are updated
It won't improve performance but it would reduce a bit the noise in gl retracer tool
2014-11-08 21:30:14 +01:00
Gregory Hainaut 47f40ed79a gsdx-ogl: reduce pbo complexity
Copy the full line into the pbo. Dma will only take GL_UNPACK_ROW_LENGTH
- increase memcpy size by 2 in the pbo
+ single memcpy will be faster and can use sse

Enable buffer_storage extension:
* GL_CLIENT_STORAGE_BIT was required (it is the duty of TexSubImage to copy data into the GPU mem)
* Enable the extension by default
2014-11-08 21:30:14 +01:00
Gregory Hainaut e62af05496 gsdx-ogl: reduce complexity of clear texture
Null is equivalent to a clear to 0.

Note: Code is not yet used because both stencil and depth are cleared.
Future note: stencil can potentially be replaced by load_store_image
2014-11-08 21:30:14 +01:00
Gregory Hainaut b020bd76c6 gsdx-ogl: restore gles build
Add the --gles build option to the linux main script
Ifdef all gl code not supported on gles3 (note some will be reenabled for gles3.1)

Note: it probably doesn't run anymore. My Nvidia driver doesn't support
yet egl/gles so I can't test it. Feel free to contribute.
2014-03-29 11:55:02 +01:00
Gregory Hainaut 8b78551b92 gsdx-ogl: improve debugging capabilities
allow to print memory transfer usage
Check gl call in dev build
2014-03-25 16:36:29 +01:00
Gregory Hainaut 403518e852 gsdx-ogl: texture management
Improve arb_buffer_storage implementation
Try harder to align the texture buffer

Strangely arb_buffer_storage is 3 times slower on my PC (nvidia)

Tester are welcome! Open the ini file
"ogl_texture_storage = 1" <= enable the extension
"ogl_texture_storage = 0" <= disable the extension

Note: you need an opengl 4.4 driver or one that support arb_buffer_storage (i.e. not catalyst)
2014-03-25 16:36:29 +01:00
gregory.hainaut 384c0c12ea gsdx ogl:
* properly detect gl nv depth extension
* Always show the hack on the gui. Add a new hack option for DATE (gl4.2) only
* Save the scan mode on linux too (f7)
* hopefully fix some crash on some drivers... (ensure aligment 256 bits alignment, and if not use std memcpy)


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5888 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-02-07 19:53:01 +00:00
gregory.hainaut 48356e31b8 linux:
* use same path as game index db for cheats and cheats_ws
* install the new cheat zip file on cmake and debian installer 


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5850 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-01-26 18:00:14 +00:00
refraction 5b14ca0fb9 GSDX: Clear up all compiler warnings. No changes to emulation.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5840 96395faa-99c1-11dd-bbfe-3dabce05a288
2014-01-26 00:58:21 +00:00
gregory.hainaut e80b002929 gsdx ogl: Flush various pending work
* try to use more subroutine on VS&PS, unfortunately hit a driver crash!
* Call Attach/DetachContext through GSDevice so I can unmap currently mapped buffer
* Implement glsl part of GL_ARB_bindless texture, again hit another driver crash!
* various fix of GL_ARB_buffer_storage. Basic benchmark show only improvement on 'cold' case, I guess it will improve smoothness
* try to fix GL_clear_texture, no success so far. It seem the extension is limited to basic texture (aka no depth/stencil)



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5752 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-10-24 20:54:27 +00:00
gregory.hainaut e01c6cd9ce gsdx ogl: the proof of concept commit
* GL_ARB_shader_subroutine for perf
fix for nvidia => add missing shader declaration. Nvidia got +4fps on colin3 :) 

For the moment only 2 PS parameters are supported. Code need to be extended to support others games that often
switch shader program (like xenosaga).
require GL4 class hardware and the option override_GL_ARB_shader_subroutine = 1
Note: strangely on AMD linux it is slower!

* GL_ARB_shader_image_load_store for accuraccy (Date)
Use a signed integer texture and reenable color buffer writing

Current status: Amagami_transparency.gs & P3_battle_shadows.gs are now working on Nvidia with a small perf impact.
Current implementation detail:
1/ setup the standard stencil as before
2/ on remaining pixel, draw once to compute first primitive that will write a fail alpha value.
3/ final draw based on primitive id of step 2

Note: I think we would get a bad behavior if depth test&mask are enabled on step 2/3
Note2: on my limited testcase the perf impact was on CPU. It would be possible to merge step1&2 to nullifying it (could 
even be faster actually), however it would require more GPU power.

Again require GL4 class hardware. And the option UserHacks_DateGL4 = 1



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5725 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-28 08:44:16 +00:00
ramapcsx2.code 3aa0f374d4 Just fixing this oversight. Thanks, gb2985.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5723 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-21 10:10:45 +00:00
gregory.hainaut 690432de30 gsdx ogl:
* redo most of the texture upload (PBO): colin3 benchmark: 32 fps now (vs 26 fps 2 weeks ago)
* use the cross vendor vsync extension on linux (previous wasn't supported by nvidia)


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5721 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-17 09:05:41 +00:00
gregory.hainaut 07605941ef gsdx ogl:
* some preliminary work to test/benchmark bindless texture in the future (glsl was not yet updated)
Bindless texture allow to get a GPU texture pointer and then set it directly
to the shader as a basic uniform.
=> no more texture unit selection/validation
=> no more texture validation neither texture hash lookup

3rdparty: update gl header to the latest gl4.4


git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5720 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-17 08:57:52 +00:00
gregory.hainaut 0f603a98d5 gsdx ogl: Test the ARB_shader_subroutine GL4.0 extension
The idea was to replace shader program swith by pointer function calls inside
shaders.  At least parameters that are often changed between draw call. So far
I only ported atst and colclip. Unfortunately code is "slower" (on GSdx standalone).
For the moment keep the code but disabled.

If I understand well the validation of program is done in the "driver thread"
but the additional call are done in the overloaded MTGS thread. Apitrace
profiling shows faster GPU draw calls. Another possibility is that the driver still
need to validate the draw call because of others state change.

Here some stats on colin3 (90 frames):
without subroutine: UseProgram 125246
with subroutine: UseProgram 2906, subroutine 125945 => 3605 extra calls overhead (not
all parameters are ported to subroutine)



git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5715 96395faa-99c1-11dd-bbfe-3dabce05a288
2013-08-10 19:43:59 +00:00