Initially we copy pitch by line in the PBO and tell the dma to only
use the first valid byte.
Now, we only copy useful data to the PBO. It reduce the copy and PBO memory requirement.
It seems a bit faster on native resolution
0: don't dump input texture
1: dump input texture
Now, you can do a first pass with only RT. When you find the wrong call, you can redump the input texture of the bad draw.
deassert gsopen_done before delete the gs object. Compiler and CPU will
reordonate the store (or not because it is a global variable). Probably not
perfect but better.
Got a crash on GSKeyEvent which I guess wasn't thread safe. So let's ensure
gs is open.
1/ initReadFifo will be first called on the GS thread
(openGL can only be done on the GS thread)
2/ readFifo will be called on the EE thread. It is not safe too access eeMem from GS
because of memory is virtual
Fix "recent" regression (crash) on Kingdom heart and others game too.
v2: add a len check on GSState::InitReadFIFO
This hack was used because GSReadFifo was called from the EE thread.
Previous commit move the call to the GSThread.
Hopefully avoid flushing the full GPU contex would improve openGL
performance (at least avoid some hiccups ;) )
Note: newer GSdx ogl won't be compatible with older PCSX2
This fixes the crashes on F9 for me. We don't want to call GSgetTitleInfo2() when GSOpen() isn't finished yet.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5834 96395faa-99c1-11dd-bbfe-3dabce05a288
* try to use more subroutine on VS&PS, unfortunately hit a driver crash!
* Call Attach/DetachContext through GSDevice so I can unmap currently mapped buffer
* Implement glsl part of GL_ARB_bindless texture, again hit another driver crash!
* various fix of GL_ARB_buffer_storage. Basic benchmark show only improvement on 'cold' case, I guess it will improve smoothness
* try to fix GL_clear_texture, no success so far. It seem the extension is limited to basic texture (aka no depth/stencil)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5752 96395faa-99c1-11dd-bbfe-3dabce05a288
* GL_ARB_shader_subroutine for perf
fix for nvidia => add missing shader declaration. Nvidia got +4fps on colin3 :)
For the moment only 2 PS parameters are supported. Code need to be extended to support others games that often
switch shader program (like xenosaga).
require GL4 class hardware and the option override_GL_ARB_shader_subroutine = 1
Note: strangely on AMD linux it is slower!
* GL_ARB_shader_image_load_store for accuraccy (Date)
Use a signed integer texture and reenable color buffer writing
Current status: Amagami_transparency.gs & P3_battle_shadows.gs are now working on Nvidia with a small perf impact.
Current implementation detail:
1/ setup the standard stencil as before
2/ on remaining pixel, draw once to compute first primitive that will write a fail alpha value.
3/ final draw based on primitive id of step 2
Note: I think we would get a bad behavior if depth test&mask are enabled on step 2/3
Note2: on my limited testcase the perf impact was on CPU. It would be possible to merge step1&2 to nullifying it (could
even be faster actually), however it would require more GPU power.
Again require GL4 class hardware. And the option UserHacks_DateGL4 = 1
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5725 96395faa-99c1-11dd-bbfe-3dabce05a288
* add a non-working hack: UserHacks_DateGL4, goal was to replace UserHacks_AlphaStencil
+ Detection of good/bad samples is based on primitive ID variable. However I'm not sure
the behavior is always the same between draw call...Anyway let's keep a copy of the current
work
* Dump integer texture into text csv
* add gl4.2 ARB_shader_image_load_store extension (needed by UserHacks_DateGL4)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5707 96395faa-99c1-11dd-bbfe-3dabce05a288
* allow to switch renderer with F9
* skip first frame in stat of the replayer
* drop msaa. Fxaa and internal resolution will do the job
* move texture attachment from texture object into device object (allow to keep sanely the state)
* split the write buffer and attachment setup
* completely split sampler and texture input setup
* redo GSDeviceOGL::CopyOffscreen to avoid an extra copy.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5704 96395faa-99c1-11dd-bbfe-3dabce05a288
* use gles header file, disable opengl code (mainly GLX, ARB_sso, geometry shader)
* Define properly the function pointer, GLES use basic linking whereas GL must get the symbol dynamically
* cmake: properly search and set libglesv2.so
* don't use dual source blending => HW renderer work (only miss unimportant FBA)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5701 96395faa-99c1-11dd-bbfe-3dabce05a288
* use svnrev.h on linux too
* replace sprintf_s with snprintf (hope it still compile on Windows)
* init integer with 0 instead of NULL
* various int -> u32/uint32/uint on for loop index
* remove a couple of unused variable
* init few variable
* disable unused warning results
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5683 96395faa-99c1-11dd-bbfe-3dabce05a288
* Separate state and shader compilation into separate function
* replace various hash_map by basic array
* Compact VertexScale and offset into a single vec4
* add the new option "ogl_vertex_subdata": subdata is faster on FGLRX, test are welcome on Nvidia drivers
0 => use map/unmap
1 => use subdata
replay: add "linux_replay" option and compute some nice stat (mean, standard deviation)
cmake: recreate shader header at build time
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5682 96395faa-99c1-11dd-bbfe-3dabce05a288
* don't delete the wnd in GSclose. It can still be used later
* Properly detect the GL_ARB_gpu_shader5 extension for Fxaa
* A couple of fix in Create (GSopen1) of GSWndEGL/GSWndOGL
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5674 96395faa-99c1-11dd-bbfe-3dabce05a288
* add exception in GSWndEGL and GSWndOGL
* Try to use EGL when GLX failed => you don't need any flags for the opensource driver ;)
cmake: don't install anymore the glsl shader
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5671 96395faa-99c1-11dd-bbfe-3dabce05a288
* update linux dialog: create a custom shader box and put it Shade boost and Fxaa
* force the reloading of the inifile after any configuration update.
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5666 96395faa-99c1-11dd-bbfe-3dabce05a288
* factorize sample object creation
* remember frame buffer attachment state
* Use a basic context on EGL. Allow to use Mesa 9.1 on AMD GPU.
* precompile vertex and geometry shader to avoid benchmark polution on replay
* Try harder to detect FGLRX driver on window
* various clean
Remain to fix the coordinate system for upscaling
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5651 96395faa-99c1-11dd-bbfe-3dabce05a288
* implement GSWndEGL copy/pasted from zzogl.
* remove linux debugging shader that failed to compile on window
* add additional debug message during context creation
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5520 96395faa-99c1-11dd-bbfe-3dabce05a288
* Remove bad redefinition of glActiveTexture and glBlendColor pointer
* Remove glew dependency on linux
* s/OGL_DEBUG/ENABLE_OGL_DEBUG/ and enable it on debug
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5512 96395faa-99c1-11dd-bbfe-3dabce05a288
* move GL loading function into its own namespace
* Use manual GL loading on linux too and mostly drop glew dependency. (Not 100% clean we are still depends on some glew define, it will be fixed later)
Note: shaders still need to be copied manually from GSdx/res to bin/plugins
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5510 96395faa-99c1-11dd-bbfe-3dabce05a288
* Update project files
* basic compilation fix: include stdafx, s/uint/uint32/
* add selection of the opengl renderer/device in gsopen
Remain to fix opengl function declaration/initialization
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5505 96395faa-99c1-11dd-bbfe-3dabce05a288
cmake: take the opportunity to drop the support of 3rdparty compilation. Distributions have got a more recent version of zlib/soundtouch anyway.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5376 96395faa-99c1-11dd-bbfe-3dabce05a288
- Removed the #define DISABLE_CRC_HACKS (since it's at the GUI now).
- reverted r5315 and r5319 (which prevented gs dumps to have a correct CRC).
- Restored the functionality of these revisions via simple skip of the other hack calls.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5320 96395faa-99c1-11dd-bbfe-3dabce05a288
gsdx:
* add some parenthesis to shup up very verbose gcc warning
* adapt ogl to latest sudonim change
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5290 96395faa-99c1-11dd-bbfe-3dabce05a288
* properly delete program and vertex array. Avoid a crash on plugin reload
* reset shader state. Avoid to reuse invalid data on plugin reload
gsdx:
* add an hack to unattach/attach the gl context from different thread. Help to solve some crashes. The best will be to move gpu operation out of gsreadfifo but it would need more works
* implement logz for test purpose (don't seem to help)
gsdx replay:
* use default xdg location
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5289 96395faa-99c1-11dd-bbfe-3dabce05a288
* for the moment only the SW render is supported, hopefully HW will come some day. And linux only for the moment.
* Require an OpenGL3 GPU (==Dx10) ie Nvidia >= 8800, AMD >= HD2000)
* Require an OpenGL4.2 compatible drivers => no opensource driver supported neither Intel driver.
* Build by default without SDL support which will dropped later. You need to add this define "ENABLE_SDL_DEV" on Win.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5186 96395faa-99c1-11dd-bbfe-3dabce05a288
Previously: GSdx was switching between the configured renderer and the best SW renderer (best = DX11 if supported).
Now: If using DX: Switch SW/HW renderer and use the same DX version. If using SDL: same as before.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5164 96395faa-99c1-11dd-bbfe-3dabce05a288
* Keep the state of the draw buffer (save few opengl call)
* AMD fix the shader unloading (12.2 and above). So disable the workaround
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5137 96395faa-99c1-11dd-bbfe-3dabce05a288
* Use the new map interface/separate texture coordinate inside shader
* support new format on texture
Note: it is quite instable with various crashes and GL error but at least it compiles now :p
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5094 96395faa-99c1-11dd-bbfe-3dabce05a288
* implement some missing shader for DATE, invert coordinate like strech rectangle
* Use glCopyImageSubDataNV nvidia extension to copy image (you need latest AMD drivers)
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5086 96395faa-99c1-11dd-bbfe-3dabce05a288
* Fix some issue on wnd management, mix between sdl/ogl render
* create new gsdx option for ogl debug
+ debug_ogl_dump: start frame to dump when not 0
+ debug_ogl_dump_length: length of the dump
+ debug_ogl_shader: print shader debug
Note current dump option must be fixed to use linux path.
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5067 96395faa-99c1-11dd-bbfe-3dabce05a288
* fix bad setup issue for constant blend factor
* Use a read framebuffer to read back the texture (less disruptive)
* cmake separate the loader to the main plugins
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5064 96395faa-99c1-11dd-bbfe-3dabce05a288
* add some define to enable/disable SDL so we could build gsdx without SDL
* debug: dump data based on frame count rather from draw count
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5044 96395faa-99c1-11dd-bbfe-3dabce05a288
* move all vertex buffer stuff into the class
* Bypass FIFO2 because of multithread issue with OGL(temporary workaround until we found a nicer solutions)
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5035 96395faa-99c1-11dd-bbfe-3dabce05a288
- 0: no multi-threading
- 1: gif packet processing and texture uploads run parallel with rendering, the slowest decides the fps, dual-cores can still suffer by the spin loops, I'll check that when I compile pcsx2 on my notebook
- 2: two rendering threads, on a decent cpu packet processing is going to be slower now, this is probably going to increase fps the most on quads
- 3: small fps increase
- 4+: even smaller.
If you have a quad cpu with HT, 6 is the max, 1 + 1 is needed for pcsx2 and gsdx's basic tasks.
Also hacked palette writes to not force a read-back in hw mode (added in previous rev), it hit render targets in a surprising large number of games.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@4998 96395faa-99c1-11dd-bbfe-3dabce05a288
* add a callback for GLERROR. Allow to breakpoint on GSDeviceOGL::DebugCallback (gdb is completely lost on amd driver, hope it is better on nvidia)
* Add some empty glsl convert to shutup some useless debugging error
* request an advance opengl context without pre 3.0 feature.
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@4983 96395faa-99c1-11dd-bbfe-3dabce05a288