pcsx2

Commit Graph

Author	SHA1	Message	Date
Gregory Hainaut	d3d5a436ea	gsdx-ogl: add code to read back depth texture	2015-05-20 08:07:40 +02:00
Gregory Hainaut	2783da4a22	gsdx-ogl: use a local buffer to store offscreen texture It will allow to read texture in // (and potentially could be useful for recording)	2015-05-18 11:29:04 +02:00
Gregory Hainaut	b1ea081fc3	gsdx-debug: improve tracing interface Basically move the format and c_str() in the macro	2015-05-17 13:05:08 +02:00
Gregory Hainaut	8a73849531	gsdx-ogl: enable multithread driver by default for nvidia + add a linux gui option to disable it (for test purpose)	2015-05-16 15:22:20 +02:00
Gregory Hainaut	02b478dfbc	gsdx: plug the new PNG wrapper Drop various duplicated code :)	2015-05-16 12:47:28 +02:00
Gregory Hainaut	8341055f3e	gsdx-ogl: don't enable ogl_texture_storage on catalyst	2015-05-16 00:31:25 +02:00
Gregory Hainaut	cfddcb7a93	gsdx-ogl: typo	2015-05-16 00:31:25 +02:00
Gregory Hainaut	0f01ba4c46	gsdx-ogl: mega boost Enable Nvidia multi thread driver optimization Enable ogl_texture_storage by default (requires for the speed boost , later the option will be removed)	2015-05-16 00:31:25 +02:00
Gregory Hainaut	6166c95325	gsdx-ogl: protect PBO with fence Safer and doesn't impact perf too much.	2015-05-15 18:32:47 +02:00
Gregory Hainaut	a5e424512c	gsdx-ogl: really avoid consecutive clean	2015-05-15 16:00:46 +02:00
Gregory Hainaut	613e215c73	gsdx-ogl: add some note for the persistent buffer + add a flush Persistent is slower (at least on my gs dump) because data is put in host instead of the video memory I don't understand why upload the data directly to the video memory is faster	2015-05-15 15:25:45 +02:00
Gregory Hainaut	5628bfb20c	gsdx-ogl: drop old code I have group so it doesn't pollute anymore gl trace	2015-05-15 15:25:45 +02:00
Gregory Hainaut	3e784d57e8	gsdx-ogl: add some flags to trace texture state goal1: avoid 2 consecutives clean of the render target goal2: only invalidate texture correctly	2015-05-12 18:03:06 +02:00
Gregory Hainaut	f37f3cb3cf	gsdx-ogl: improve texture uploading Initially we copy pitch by line in the PBO and tell the dma to only use the first valid byte. Now, we only copy useful data to the PBO. It reduce the copy and PBO memory requirement. It seems a bit faster on native resolution	2015-05-11 16:32:13 +02:00
Gregory Hainaut	4e2e9aa56c	gsdx-ogl: always read the first attachment of the fbo	2015-05-11 16:28:34 +02:00
Gregory Hainaut	1523b9534f	gsdx-debug: compact the code	2015-05-11 11:19:00 +02:00
Gregory Hainaut	cc4713d379	gsdx-debug: extend ogl debug capabilities Group opengl calls into a nice name. Apitrace shows them in a tree format that support folding. Previously it was a long flat list (10K-40K of lines by frame) I align the call number with the internal s_n variable. This way it is easy to map GSdx dump output with the GL debugger :)	2015-05-06 19:09:13 +02:00
Gregory Hainaut	73d04e33e9	gsdx ogl: clean various comment and old code	2015-05-01 20:04:23 +02:00
Gregory Hainaut	335695bd0e	purge GLES from GSdx ! mobile will use vulkan (or any new API) anyway	2015-05-01 20:02:17 +02:00
Gregory Hainaut	004fa7aea4	gsdx debug: allow to dump alpha channel as a gray texture I would love to find an image viewer that allow to mask channel of the image	2015-05-01 13:38:58 +02:00
Gregory Hainaut	c76e66f8d2	gsdx-ogl: fix read back of render target Initial code use a PBO to do asynchronous transfer. It is silly because GSdx doesn't use this free time. So let's use a sync read. Same speed but no PBO to manage.	2015-05-01 01:26:44 +02:00
Gregory Hainaut	25997647f2	gsdx-ogl: add ENABLE_OGL_PNG_OPAQUE to dump texture without alpha Alpha is nice but fully transparent texture suck The best will be an image viewer that can toggle the alpha channel	2015-04-30 23:06:54 +02:00
Gregory Hainaut	8a52fdab57	gsdx-ogl: allow to dump texture as png file -- slower (but that a debug feature) ++ smaller (40x-50x) ++ native support of alpha Require libpng++ and the define ENABLE_OGL_PNG Note: depth is not supported yet.	2015-04-30 20:02:50 +02:00
Gregory Hainaut	ee19a2789c	gsdx: move invalidation from GSDevice to GSTexture Much cleaner this way	2015-04-30 19:55:57 +02:00
Gregory Hainaut	ee244071fa	gsdx-ogl: use 64 bits counter + fix division factor I also added a counter of the real size of the texture. I have a bad overhead for pbo transfer	2015-04-25 14:18:21 +02:00
Gregory Hainaut	00e62919c5	gsdx-ogl: use countof macro instead to hardcode the size	2015-04-25 13:06:02 +02:00
Gregory Hainaut	672e3f9533	gsdx-ogl: use DSA for texture management Yeah code is much nicer :)	2015-04-24 19:34:17 +02:00
Gregory Hainaut	03e72781aa	gsdx-ogl: drop support of GL_ARB_clear_texture extension Extension is a bit slower. We use it to clear the RT but we generally use it right away so we don't avoid the FB attachment.	2015-04-24 18:15:58 +02:00
Gregory Hainaut	258b73409c	gsdx-ogl: update flags for buffer storage Fix issue with Mesa driver	2015-04-23 21:10:43 +02:00
Gregory Hainaut	f6652e9a50	gsdx-ogl: disable slow and buggy code until I found a better solution ogl_texture_storage 1 creates texture corruption. Advance date is too slow, code need to be updated (properly) to uses 2 passes only not 3 Maybe one could be enough (sometimes)	2015-04-22 09:33:41 +02:00
Gregory Hainaut	15dcf07b3b	revert previous commit Not better, worst slower. I'm afraid I will need proper fencing	2015-04-22 00:32:46 +02:00
Gregory Hainaut	8386b427ea	gsdx ogl: restore GL_MAP_COHERENT_BIT for texture upload I hope to fix the texture upload corruption I hope that impact on perf will remain small	2015-04-21 23:34:26 +02:00
Gregory Hainaut	330d14941f	gsdx-linux: support dump mode on linux It could be useful to analyze GS dump. Warning it consumes a lot of disk space.	2015-02-21 13:51:06 +01:00
Gregory Hainaut	276e3d9d1b	gsdx-ogl: always set texture parameter * Avoid bug after a pause * Not faster anyway * keep old method only for gl retracer to reduce debugging noise Remove some old&useless comment	2014-11-14 11:43:42 +01:00
Gregory Hainaut	16377f7249	gsdx-ogl: only call PixelStorei when parameters are updated It won't improve performance but it would reduce a bit the noise in gl retracer tool	2014-11-08 21:30:14 +01:00
Gregory Hainaut	47f40ed79a	gsdx-ogl: reduce pbo complexity Copy the full line into the pbo. Dma will only take GL_UNPACK_ROW_LENGTH - increase memcpy size by 2 in the pbo + single memcpy will be faster and can use sse Enable buffer_storage extension: * GL_CLIENT_STORAGE_BIT was required (it is the duty of TexSubImage to copy data into the GPU mem) * Enable the extension by default	2014-11-08 21:30:14 +01:00
Gregory Hainaut	e62af05496	gsdx-ogl: reduce complexity of clear texture Null is equivalent to a clear to 0. Note: Code is not yet used because both stencil and depth are cleared. Future note: stencil can potentially be replaced by load_store_image	2014-11-08 21:30:14 +01:00
Gregory Hainaut	b020bd76c6	gsdx-ogl: restore gles build Add the --gles build option to the linux main script Ifdef all gl code not supported on gles3 (note some will be reenabled for gles3.1) Note: it probably doesn't run anymore. My Nvidia driver doesn't support yet egl/gles so I can't test it. Feel free to contribute.	2014-03-29 11:55:02 +01:00
Gregory Hainaut	8b78551b92	gsdx-ogl: improve debugging capabilities allow to print memory transfer usage Check gl call in dev build	2014-03-25 16:36:29 +01:00
Gregory Hainaut	403518e852	gsdx-ogl: texture management Improve arb_buffer_storage implementation Try harder to align the texture buffer Strangely arb_buffer_storage is 3 times slower on my PC (nvidia) Tester are welcome! Open the ini file "ogl_texture_storage = 1" <= enable the extension "ogl_texture_storage = 0" <= disable the extension Note: you need an opengl 4.4 driver or one that support arb_buffer_storage (i.e. not catalyst)	2014-03-25 16:36:29 +01:00
gregory.hainaut	384c0c12ea	gsdx ogl: * properly detect gl nv depth extension * Always show the hack on the gui. Add a new hack option for DATE (gl4.2) only * Save the scan mode on linux too (f7) * hopefully fix some crash on some drivers... (ensure aligment 256 bits alignment, and if not use std memcpy) git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5888 96395faa-99c1-11dd-bbfe-3dabce05a288	2014-02-07 19:53:01 +00:00
gregory.hainaut	48356e31b8	linux: * use same path as game index db for cheats and cheats_ws * install the new cheat zip file on cmake and debian installer git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5850 96395faa-99c1-11dd-bbfe-3dabce05a288	2014-01-26 18:00:14 +00:00
$refraction$ refraction	5b14ca0fb9	GSDX: Clear up all compiler warnings. No changes to emulation. git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5840 96395faa-99c1-11dd-bbfe-3dabce05a288	2014-01-26 00:58:21 +00:00
gregory.hainaut	e80b002929	gsdx ogl: Flush various pending work * try to use more subroutine on VS&PS, unfortunately hit a driver crash! * Call Attach/DetachContext through GSDevice so I can unmap currently mapped buffer * Implement glsl part of GL_ARB_bindless texture, again hit another driver crash! * various fix of GL_ARB_buffer_storage. Basic benchmark show only improvement on 'cold' case, I guess it will improve smoothness * try to fix GL_clear_texture, no success so far. It seem the extension is limited to basic texture (aka no depth/stencil) git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5752 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-10-24 20:54:27 +00:00
gregory.hainaut	e01c6cd9ce	gsdx ogl: the proof of concept commit * GL_ARB_shader_subroutine for perf fix for nvidia => add missing shader declaration. Nvidia got +4fps on colin3 :) For the moment only 2 PS parameters are supported. Code need to be extended to support others games that often switch shader program (like xenosaga). require GL4 class hardware and the option override_GL_ARB_shader_subroutine = 1 Note: strangely on AMD linux it is slower! * GL_ARB_shader_image_load_store for accuraccy (Date) Use a signed integer texture and reenable color buffer writing Current status: Amagami_transparency.gs & P3_battle_shadows.gs are now working on Nvidia with a small perf impact. Current implementation detail: 1/ setup the standard stencil as before 2/ on remaining pixel, draw once to compute first primitive that will write a fail alpha value. 3/ final draw based on primitive id of step 2 Note: I think we would get a bad behavior if depth test&mask are enabled on step 2/3 Note2: on my limited testcase the perf impact was on CPU. It would be possible to merge step1&2 to nullifying it (could even be faster actually), however it would require more GPU power. Again require GL4 class hardware. And the option UserHacks_DateGL4 = 1 git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5725 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-08-28 08:44:16 +00:00
ramapcsx2.code	3aa0f374d4	Just fixing this oversight. Thanks, gb2985. git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5723 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-08-21 10:10:45 +00:00
gregory.hainaut	690432de30	gsdx ogl: * redo most of the texture upload (PBO): colin3 benchmark: 32 fps now (vs 26 fps 2 weeks ago) * use the cross vendor vsync extension on linux (previous wasn't supported by nvidia) git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5721 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-08-17 09:05:41 +00:00
gregory.hainaut	07605941ef	gsdx ogl: * some preliminary work to test/benchmark bindless texture in the future (glsl was not yet updated) Bindless texture allow to get a GPU texture pointer and then set it directly to the shader as a basic uniform. => no more texture unit selection/validation => no more texture validation neither texture hash lookup 3rdparty: update gl header to the latest gl4.4 git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5720 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-08-17 08:57:52 +00:00
gregory.hainaut	0f603a98d5	gsdx ogl: Test the ARB_shader_subroutine GL4.0 extension The idea was to replace shader program swith by pointer function calls inside shaders. At least parameters that are often changed between draw call. So far I only ported atst and colclip. Unfortunately code is "slower" (on GSdx standalone). For the moment keep the code but disabled. If I understand well the validation of program is done in the "driver thread" but the additional call are done in the overloaded MTGS thread. Apitrace profiling shows faster GPU draw calls. Another possibility is that the driver still need to validate the draw call because of others state change. Here some stats on colin3 (90 frames): without subroutine: UseProgram 125246 with subroutine: UseProgram 2906, subroutine 125945 => 3605 extra calls overhead (not all parameters are ported to subroutine) git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5715 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-08-10 19:43:59 +00:00
gregory.hainaut	a46b489a24	gsdx ogl: various minor optimization. * move most of gl states into a separate namespace. Extend it to depth/stencil/blend micro state => save 10,000 opengl call by frame for colin mcrae 3 * Only setup blend state of first drawbuffer * Don't request anymore a debug context on dev/release build git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5713 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-08-05 20:25:25 +00:00


96 Commits