pcsx2

Commit Graph

Author	SHA1	Message	Date
Gregory Hainaut	46ff4dc3d3	gsdx-ogl: hardware unit only support normalization of 4 bytes... (At least on recent AMD GPU)	2015-04-27 18:51:59 +02:00
Gregory Hainaut	ee244071fa	gsdx-ogl: use 64 bits counter + fix division factor I also added a counter of the real size of the texture. I have a bad overhead for pbo transfer	2015-04-25 14:18:21 +02:00
Gregory Hainaut	47a0026b60	gsdx-ogl: print the bandwidth of uniform	2015-04-25 13:00:03 +02:00
Gregory Hainaut	757726bb91	gsdx-ogl: allow to invalidate the texture It is just a hint to the driver to avoid any useless transfer I don't expect any change but it is free so why not ;)	2015-04-25 12:50:12 +02:00
Gregory Hainaut	36514bd95f	glsl: fog is a single byte Give a chance to the driver to optimize if possible	2015-04-24 21:37:37 +02:00
Gregory Hainaut	c207632e49	gsdx-ogl: improve date performance for GL45 If there is no overlap, it is allowed to directly read from the render target. On SotC testcase with 6x scaling: 30fps -> 40fps Note: it requires GL_ARB_texture_barrier extension so be sure to have a recent driver Note2: it requires a lot of testing too Open question: in case of complex date (written alpha) Will it be faster to split the draw call into multiple calls with no primitive overlap?	2015-04-24 21:12:33 +02:00
Gregory Hainaut	795ae50ecd	gsdx-ogl: fix the recently broken advance date feature Now it is really working with a 2 stages shaders but it is still slow.	2015-04-24 20:13:38 +02:00
Gregory Hainaut	672e3f9533	gsdx-ogl: use DSA for texture management Yeah code is much nicer :)	2015-04-24 19:34:17 +02:00
Gregory Hainaut	6e386df535	gsdx-ogl: avoid to clean fully texture in DATE It is useless and it has a small impact on performance for big upscale	2015-04-24 18:32:08 +02:00
Gregory Hainaut	03e72781aa	gsdx-ogl: drop support of GL_ARB_clear_texture extension Extension is a bit slower. We use it to clear the RT but we generally use it right away so we don't avoid the FB attachment.	2015-04-24 18:15:58 +02:00
Gregory Hainaut	19eb1f00d1	gsdx ogl: flush vbo range instead of barrier For testing purpose. I don't know which one is better. It seems flushing have less fps fluctuation than barrier.	2015-04-21 21:44:50 +02:00
Gregory Hainaut	ce98276322	gsdx-ogl: improve speed of vertex streaming Not yet enabled because I'm afraid of data corruption but feel free to test it The option: ogl_vertex_storage = 1 Performance note (warm cache+gs replay on colin3) 60 fps -> 76 fps	2015-04-20 09:38:03 +02:00
Gregory Hainaut	62489f42f1	gsdx-ogl: add an optimization note for later Only 1 byte of fog is useful	2015-04-20 07:18:09 +02:00
Gregory Hainaut	31f8c065db	gsdx-ogl: implement a new hack UserHacks_UnscaleSprite for opengl UserHacks_UnscaleSprite = 1 will unscale flat sprites UserHacks_UnscaleSprite = 2 will unscale all sprites (don't work well so far) The idea of the hack is to redo the interpolation of texture coordinate based on the non-upscaled pixel position. It avoids various glitches but sprites aren't upscaled anymore (so no more anti-aliasing, potentially a coefficient can be added).	2015-04-20 07:18:08 +02:00
Gregory Hainaut	6124eb844e	gsdx-ogl: only compile useful VS logz is a constant wildhack is only compatible with TME/FST Compilation goes down from 64 to 20 vertex shaders.	2015-04-20 07:17:58 +02:00
Gregory Hainaut	15264c6c63	glsl: split the main shader * separate VS/GS and FS * separate subroutine part of the FS It is already complex enough without subroutine stuff. Besides I'm not sure we will keep subroutine in the future.	2015-04-19 18:49:02 +02:00
Gregory Hainaut	418f2e69a8	gsdx-ogl: implement the wildhack on the GPU Likely much faster for opengl and much easier to implement Note: hopefully UserHacks_round_sprite_offset will replace it	2015-04-13 22:14:36 +02:00
Gregory Hainaut	8c90e7cafc	gsdx-ogl: support latest fxaa version Only tested on Nvidia, please report any issue with your driver Note: requires GL4 GPU	2014-11-10 10:39:55 +01:00
Gregory Hainaut	ff39dffe23	gsdx-ogl: add a gui option (linux) to select external shader Note: of course it requires a glsl shader ;) On windows, you can set the path on the ini file. Here an example with linux path: shaderfx_conf = /home/gregory/playstation/emulateur/pcsx2_merge/bin/GSdx_FX_Settings.ini shaderfx_glsl = /home/gregory/playstation/emulateur/pcsx2_merge/bin/shader.fx	2014-11-10 10:38:52 +01:00
Gregory Hainaut	920ac6695f	gsdx-ogl: add preliminary support of external shader fx	2014-11-10 10:37:58 +01:00
Gregory Hainaut	e62af05496	gsdx-ogl: reduce complexity of clear texture Null is equivalent to a clear to 0. Note: Code is not yet used because both stencil and depth are cleared. Future note: stencil can potentially be replaced by load_store_image	2014-11-08 21:30:14 +01:00
Tom Burnett	1f734a69a0	Small VS2013 fixes	2014-10-12 01:40:40 -07:00
Gregory Hainaut	ccc1137e12	gsdx-ogl: merge the two vertex buffer format * Only a single VAO => Format is set once => Only a single bind at startup => GSVertexBufferStateOGL is nearly useless => barely faster but better than nothing :)	2014-10-02 20:44:22 +02:00
Gregory Hainaut	10c7be8c50	gsdx-ogl: Use 32B strides for all VBO	2014-10-02 20:44:22 +02:00
Gregory Hainaut	79e8a912cd	gsdx-ogl: keep the draw buffer enabled by default Note: Only DATE requires to disable the draw buffer	2014-09-30 22:18:20 +02:00
Gregory Hainaut	f46e8cc6ac	gsdx-ogl: bump base requirement to 3.3 A couple of fallbacks were introduced for the Mesa driver that only support 3.0 DSA will require a recent Mesa which already support GL3.3 Require at least SandyBridge for Intel GPU	2014-09-30 22:18:20 +02:00
Gregory Hainaut	594f6c33a2	gsdx-ogl-ES: require GL_EXT_shader_io_blocks + GLES3.1 Allow to use same shader interface for all API Note: on the GL API it will require GL3.3 (see next commit)	2014-09-30 22:18:01 +02:00
Gregory Hainaut	8833afc2fa	gsdx-ogl: drop GL_ARB_multi_bind It will be replaced by DSA so let's reduce the complexity of opengl	2014-09-28 12:23:44 +02:00
Gregory Hainaut	1e86e3cb08	gsdx-ogl: rework callback debug * use DebugOutputToFile as a callback of gl error. Add a breakpoint to find the culprit GL call * use string instead of char[n] Note: CheckDebugLog is potentially useless now	2014-09-28 12:00:34 +02:00
Gregory Hainaut	9d8d702aa6	gsdx-ogl: drop GL_NV_depth_clamp extension superseded by GL_ARB_clip_control	2014-09-28 12:00:34 +02:00
Gregory Hainaut	4659184cc1	gsdx-ogl: add support of clip_control (depth only) * replace the [-1;1] depth range of openGL with the DX range [0;1].	2014-09-28 12:00:34 +02:00
Gregory Hainaut	cc24da128c	gsdx-ogl: fix for gl_clear_texture Note: Disabled for depth_stencil texture (I'm not sure we can split the two)	2014-09-28 12:00:34 +02:00
Gregory Hainaut	58a8683d7d	gsdx-ogl: disable texture compare mode It seems to be used for depth texture	2014-09-22 09:27:34 +02:00
Gregory Hainaut	d51f008c72	gsdx: openglES fix * require a 3.1 context * unattach texture of the fbo when they're not used (avoid to have a texture and depth_stencil with different size) Note: except minor shader bug it works on Nvidia 340.23.01	2014-09-22 09:27:31 +02:00
Gregory Hainaut	0d45e6d70e	gsdx-ogl: avoid to send constant to the GPU It was a waste of bandwidth	2014-04-06 10:44:40 +02:00
Gregory Hainaut	b020bd76c6	gsdx-ogl: restore gles build Add the --gles build option to the linux main script Ifdef all gl code not supported on gles3 (note some will be reenabled for gles3.1) Note: it probably doesn't run anymore. My Nvidia driver doesn't support yet egl/gles so I can't test it. Feel free to contribute.	2014-03-29 11:55:02 +01:00
Gregory Hainaut	8b78551b92	gsdx-ogl: improve debugging capabilities allow to print memory transfer usage Check gl call in dev build	2014-03-25 16:36:29 +01:00
Gregory Hainaut	41091f8ebf	gsdx ogl: remove multithread hack This hack was used because GSReadFifo was called from the EE thread. Previous commit moved the call to the GSThread. Hopefully avoiding flushing the full GPU context will improve openGL performance (at least avoid some hiccups ;) ) Note: newer GSdx ogl won't be compatible with older PCSX2	2014-03-25 16:36:29 +01:00
gregory.hainaut	384c0c12ea	gsdx ogl: * properly detect gl nv depth extension * Always show the hack on the gui. Add a new hack option for DATE (gl4.2) only * Save the scan mode on linux too (f7) * hopefully fix some crash on some drivers... (ensure 256 bits alignment, and if not use std memcpy) git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5888 96395faa-99c1-11dd-bbfe-3dabce05a288	2014-02-07 19:53:01 +00:00
gregory.hainaut	c0558c00e7	gsdx ogl: * gui refresh + Use some tab to reduce height for small screen + Add logz option + remove broken/experimental keyword. GSdx ogl is not too bad ;) * autodetect GL_NV_depth_buffer_float Linux tester you are welcome! git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5862 96395faa-99c1-11dd-bbfe-3dabce05a288	2014-02-01 11:11:14 +00:00
gregory.hainaut	84895eadd9	gsdx ogl: correct most of Z-depth issue Best setting if your driver supports GL_NV_depth_buffer_float => GL_NV_Depth = 1 & logz = 0 Otherwise => GL_NV_Depth = 0 & logz = 1 Explanation of the bug: Dx z position ranges from 0.0f to 1.0f (FS ranges 0.0f to 1.0f) GL z Position ranges from -1.0f to 1.0f (FS ranges 0.0f to 1.0f) Why it sucks: GS small depth value will be "mapped" to -1.0f. In other words all small values will be 1.0f! Terrible loss of accuracy. The GL_NV_depth_buffer_float extension allows to set the near plane as -1.0f. So "GL z Position ranges from -1.0f to 1.0f (FS ranges 0.0f to 1.0f)" will become "GL z Position ranges from -1.0f to 1.0f (FS ranges -1.0f to 1.0f)" and therefore "z position [0.0f;1.0f] will map to FS [0.0f;1.0f]" as DX Yes we just get back all precision lost previously :) However you need hardware (intel?) and driver support (free driver?/gles?) :( git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5860 96395faa-99c1-11dd-bbfe-3dabce05a288	2014-01-31 21:30:54 +00:00
gregory.hainaut	c2aa4ff3fd	gsdx ogl: * restore the old fxaa (Asmodeam will be integrated when I got time) * port the recently added new scanline algo git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5818 96395faa-99c1-11dd-bbfe-3dabce05a288	2014-01-18 14:46:13 +00:00
gregory.hainaut	2238095a82	gsdx ogl: do the same as previous commit but for ogl ;) * fix the missing auto interlace opt when cycling with hotkey on linux git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5810 96395faa-99c1-11dd-bbfe-3dabce05a288	2014-01-12 11:38:50 +00:00
gregory.hainaut	e80b002929	gsdx ogl: Flush various pending work * try to use more subroutine on VS&PS, unfortunately hit a driver crash! * Call Attach/DetachContext through GSDevice so I can unmap currently mapped buffer * Implement glsl part of GL_ARB_bindless texture, again hit another driver crash! * various fix of GL_ARB_buffer_storage. Basic benchmark show only improvement on 'cold' case, I guess it will improve smoothness * try to fix GL_clear_texture, no success so far. It seem the extension is limited to basic texture (aka no depth/stencil) git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5752 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-10-24 20:54:27 +00:00
gregory.hainaut	e01c6cd9ce	gsdx ogl: the proof of concept commit * GL_ARB_shader_subroutine for perf fix for nvidia => add missing shader declaration. Nvidia got +4fps on colin3 :) For the moment only 2 PS parameters are supported. Code needs to be extended to support other games that often switch shader program (like xenosaga). require GL4 class hardware and the option override_GL_ARB_shader_subroutine = 1 Note: strangely on AMD linux it is slower! * GL_ARB_shader_image_load_store for accuracy (Date) Use a signed integer texture and reenable color buffer writing Current status: Amagami_transparency.gs & P3_battle_shadows.gs are now working on Nvidia with a small perf impact. Current implementation detail: 1/ setup the standard stencil as before 2/ on remaining pixel, draw once to compute first primitive that will write a fail alpha value. 3/ final draw based on primitive id of step 2 Note: I think we would get a bad behavior if depth test&mask are enabled on step 2/3 Note2: on my limited testcase the perf impact was on CPU. It would be possible to merge step1&2 to nullify it (could even be faster actually), however it would require more GPU power. Again require GL4 class hardware. And the option UserHacks_DateGL4 = 1 git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5725 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-08-28 08:44:16 +00:00
gregory.hainaut	07605941ef	gsdx ogl: * some preliminary work to test/benchmark bindless texture in the future (glsl was not yet updated) Bindless texture allow to get a GPU texture pointer and then set it directly to the shader as a basic uniform. => no more texture unit selection/validation => no more texture validation neither texture hash lookup 3rdparty: update gl header to the latest gl4.4 git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5720 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-08-17 08:57:52 +00:00
gregory.hainaut	b4084047be	gsdx ogl: Used a basic flat interpolation for color interpolation (line & tri primitives) Card that support gs: remain only a gs to generate sprite from a line. Even dummy gs are costly for the GPU. Card that don't support gs: remove useless copy of color for line and triangle primitives Note for dx: opengl 3.2 (maybe not gles) supports both flat interpolation convention (GL_FIRST_VERTEX_CONVENTION or GL_LAST_VERTEX_CONVENTION). It might be possible to shuffle vertex index to put the last vertex in first position. - buff[0] = head + 0; - buff[1] = head + 1; - buff[2] = head + 2; + buff[0] = head + 2; + buff[1] = head + 1; + buff[2] = head + 0; git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5718 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-08-14 10:18:38 +00:00
gregory.hainaut	0f603a98d5	gsdx ogl: Test the ARB_shader_subroutine GL4.0 extension The idea was to replace shader program switch by pointer function calls inside shaders. At least parameters that are often changed between draw call. So far I only ported atst and colclip. Unfortunately code is "slower" (on GSdx standalone). For the moment keep the code but disabled. If I understand well the validation of program is done in the "driver thread" but the additional call are done in the overloaded MTGS thread. Apitrace profiling shows faster GPU draw calls. Another possibility is that the driver still need to validate the draw call because of others state change. Here some stats on colin3 (90 frames): without subroutine: UseProgram 125246 with subroutine: UseProgram 2906, subroutine 125945 => 3605 extra calls overhead (not all parameters are ported to subroutine) git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5715 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-08-10 19:43:59 +00:00
gregory.hainaut	a46b489a24	gsdx ogl: various minor optimization. * move most of gl states into a separate namespace. Extend it to depth/stencil/blend micro state => save 10,000 opengl call by frame for colin mcrae 3 * Only setup blend state of first drawbuffer * Don't request anymore a debug context on dev/release build git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5713 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-08-05 20:25:25 +00:00
gregory.hainaut	34045eb8f7	gsdx ogl: AMD users upgrade to 13.8 now ;) * clean extension management and fix compilation of previous gl44 code. * Use pixel buffer object to upload texture data. => avoid crash on AMD driver => a bit faster and probably got some margins for the future git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5712 96395faa-99c1-11dd-bbfe-3dabce05a288	2013-08-03 08:29:01 +00:00
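
Note on the Z-depth commits (84895eadd9, 4659184cc1): the precision loss they describe can be illustrated with a small sketch. It assumes the default OpenGL depth range, where a normalized z in [-1, 1] becomes a window depth of 0.5 * z + 0.5, and a 32-bit float depth value. The helper names (to_f32, gl_window_depth, dx_window_depth) and the two sample depths are hypothetical and chosen only for illustration; this is not code from GSdx.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def gl_window_depth(z_ndc: float) -> float:
    """Default OpenGL mapping: z in [-1, 1] becomes a window depth in [0, 1]."""
    return to_f32(0.5 * to_f32(z_ndc) + 0.5)


def dx_window_depth(z_ndc: float) -> float:
    """Direct3D-style mapping (also what a [0, 1] clip range gives): z is kept as-is."""
    return to_f32(z_ndc)


# Two distinct, very small depth values (arbitrary sample data).
a, b = 1e-10, 2e-10

# With the default GL mapping both land on 0.5 and can no longer be told apart.
print(gl_window_depth(a) == gl_window_depth(b))  # True

# With a [0, 1] range the float keeps its precision near zero.
print(dx_window_depth(a) == dx_window_depth(b))  # False
```

Floats are densest near zero, so shifting small depths up to around 0.5 discards most of their mantissa; keeping the [0, 1] range avoids that shift.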

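Note on the flat-interpolation commit (b4084047be): the index shuffle shown in its message can be sketched as below. The function name triangle_indices and its last_vertex_first flag are hypothetical; only the index arithmetic comes from the commit message. The sketch assumes one triangle whose three vertices start at index head.

```python
from typing import List


def triangle_indices(head: int, last_vertex_first: bool = False) -> List[int]:
    """Return the three indices of a triangle starting at `head`.

    With last_vertex_first=True the order is reversed so that the vertex
    that used to be last is emitted first, as in the commit's diff.
    """
    if last_vertex_first:
        return [head + 2, head + 1, head + 0]
    return [head + 0, head + 1, head + 2]


print(triangle_indices(6))        # [6, 7, 8]
print(triangle_indices(6, True))  # [8, 7, 6]
```

Reversing the order also flips the triangle's winding, so this only works where face culling does not depend on winding.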

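Note on the subroutine commit (0f603a98d5): the "3605 extra calls" figure follows directly from the colin3 counts quoted in the message. A short check, with variable names that are only illustrative:

```python
# Call counts over 90 frames of colin3, as quoted in the commit message.
use_program_before = 125246   # UseProgram calls without subroutines
use_program_after = 2906      # UseProgram calls with subroutines
subroutine_calls = 125945     # subroutine calls with subroutines

total_after = use_program_after + subroutine_calls
extra_calls = total_after - use_program_before

print(total_after)  # 128851
print(extra_calls)  # 3605
```

The subroutine path therefore issues slightly more API calls in total, which fits the commit's remark that it was not faster on the CPU side.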
156 Commits