+ Implement threads/mutex using std::thread/std::mutex/std::condition_variable
+ Old implementation can be used by commenting out the "define _STD_THREAD_ in GSThread.h
+ Commit should be much cleaner than previous one.
Note: of course it requires a glsl shader ;)
On windows, you can set the path on the ini file. Here an example with linux path:
shaderfx_conf = /home/gregory/playstation/emulateur/pcsx2_merge/bin/GSdx_FX_Settings.ini
shaderfx_glsl = /home/gregory/playstation/emulateur/pcsx2_merge/bin/shader.fx
I need to check carefully the consequence of ABI change. So far wx is very unhappy!
Fatal Error: Mismatch between the program and library build versions detected.
The library used 2.8 (no debug,Unicode,compiler with C++ ABI 1002,wx containers,compatible with 2.6),
and your program used 2.8 (no debug,Unicode,compiler with C++ ABI 1006,wx containers,compatible with 2.6).
Copy the full line into the pbo. Dma will only take GL_UNPACK_ROW_LENGTH
- increase memcpy size by 2 in the pbo
+ single memcpy will be faster and can use sse
Enable buffer_storage extension:
* GL_CLIENT_STORAGE_BIT was required (it is the duty of TexSubImage to copy data into the GPU mem)
* Enable the extension by default
Null is equivalent to a clear to 0.
Note: Code is not yet used because both stencil and depth are cleared.
Future note: stencil can potentially be replaced by load_store_image
* Don't use the stack
* Don't compare MinMax parameter (depends of others)
* Don't store not-compared parameter in the cache (HalfTexel/MinMax)
+0.3fps/46fps (well better than nothing)
* Only a single VAO
=> Format is set once
=> Only a single bind at startup
=> GSVertexBufferStateOGL is nearly useless
=> barely faster but better than nothing :)
A couple of fallbacks were introduced for the Mesa driver that only support 3.0
DSA will require a recent Mesa which already support GL3.3
Require at least SandyBridge for Intel GPU
* use DebugOutputToFile as a callback of gl error. Add a breakpoint to
find the culprit GL call
* use string instead of char[n]
Note: CheckDebugLog is potentially useless now
* GL_ARB_clip_control: reduce z fighting
* GL_ARB_clear_texture: no real speed gain (but improve code quality)
* GL_ARB_bindless_texture: +1fps (if you're CPU limited)
* require a 3.1 context
* unattach texture of the fbo when they're not used
(avoid to have a texture and depth_stencil with different size)
Note: except minor shader bug it works on Nvidia 340.23.01
This doesn't update the file to the latest version from mingw32 since this is already a custom header stripped from mingw32.
This also fixes the functions for x86_64(verified that it still works for both architectures) and also updates the version inside of GSdx.
Also removes a comment in the GSdx header saying that these functions are broken since they no longer are.
deassert gsopen_done before delete the gs object. Compiler and CPU will
reordonate the store (or not because it is a global variable). Probably not
perfect but better.
Got a crash on GSKeyEvent which I guess wasn't thread safe. So let's ensure
gs is open.
There is no such thing as MGS3 Substance only Subsistence. Substance is MGS2 :P
Also this new CRC is US Disc 2 of Subsistence. The blue stripes should be fixed now I guess.
1/ initReadFifo will be first called on the GS thread
(openGL can only be done on the GS thread)
2/ readFifo will be called on the EE thread. It is not safe too access eeMem from GS
because of memory is virtual
Fix "recent" regression (crash) on Kingdom heart and others game too.
v2: add a len check on GSState::InitReadFIFO
* Fix to create a 3.0 GLES context
* set a default precision to fix most of shader compilation issue
* Crash later because of GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT
=> need to test opensource driver
Remains 3 files that I don't know the source
pcsx2/windows/DwmSetup.cpp: *No copyright* UNKNOWN
pcsx2/windows/SamplProf.cpp: *No copyright* UNKNOWN
pcsx2/windows/VCprojects/IopSif.cpp: *No copyright* UNKNOWN
Remains 1 files in common that I don't know the source
common/include/comptr.h: *No copyright* UNKNOWN
Remains too much files in plugins that I don't know the source :(
Add the --gles build option to the linux main script
Ifdef all gl code not supported on gles3 (note some will be reenabled for gles3.1)
Note: it probably doesn't run anymore. My Nvidia driver doesn't support
yet egl/gles so I can't test it. Feel free to contribute.
A bit slower. Maybe because SubData does the copy in the driver thread. My memcpy is done on
the main thread. I'm not sure it would worth an extra thread to copy vertex data to the GPU
Note: testers are welcome. You need to edit the ini file.
"ogl_vertex_storage=1" <= enable the extension
"ogl_vertex_storage=0" <= disable the extension
Again you need the support of GL_arb_buffer_storage (i.e. not catalyst)
Improve arb_buffer_storage implementation
Try harder to align the texture buffer
Strangely arb_buffer_storage is 3 times slower on my PC (nvidia)
Tester are welcome! Open the ini file
"ogl_texture_storage = 1" <= enable the extension
"ogl_texture_storage = 0" <= disable the extension
Note: you need an opengl 4.4 driver or one that support arb_buffer_storage (i.e. not catalyst)
This hack was used because GSReadFifo was called from the EE thread.
Previous commit move the call to the GSThread.
Hopefully avoid flushing the full GPU contex would improve openGL
performance (at least avoid some hiccups ;) )
Note: newer GSdx ogl won't be compatible with older PCSX2
Crashfix for a weird GIF_FLG_IMAGE2 situation in Wallace and Gromit Project Zoo. Needs further investigation.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5912 96395faa-99c1-11dd-bbfe-3dabce05a288
reduce_gl_requirement_for_free_driver => set it to 1 to only required a 3.0 context (you still required the good extensions)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5910 96395faa-99c1-11dd-bbfe-3dabce05a288
Reformatted fx files that were causing issues on certain text editors. They should now display correctly in those editors.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5897 96395faa-99c1-11dd-bbfe-3dabce05a288
Also removed the fallback recovery ps, and replaced the compile fail catch to a simple console print. Which I think is safer, and faster.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5894 96395faa-99c1-11dd-bbfe-3dabce05a288
UserHacks_AlphaStencil will take precedence on override_GL_ARB_shader_image_load_store
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5891 96395faa-99c1-11dd-bbfe-3dabce05a288
Note: DATE is used for shadows (persona 3) and others effect
You can disable it when you disable gl image extension (override_GL_ARB_shader_image_load_store = 0)
You can also enable it on AMD too (set it to 1 this time) but no guarantee.
If you feel the extension is too slow, you can try disable some gl barrier (aka "damn the torpedo full steam ahead!").
It can be done with the option UserHacks_DateGL4 = 1. Otherwise just disable the extension.
Note: don't enable UserHacks_AlphaStencil in the same time. GL_ARB_shader_image_load_store is an alternative implementation.
Enabling both in the same time will lead to undefined (well surely wrong) behavior.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5890 96395faa-99c1-11dd-bbfe-3dabce05a288
* properly detect gl nv depth extension
* Always show the hack on the gui. Add a new hack option for DATE (gl4.2) only
* Save the scan mode on linux too (f7)
* hopefully fix some crash on some drivers... (ensure aligment 256 bits alignment, and if not use std memcpy)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5888 96395faa-99c1-11dd-bbfe-3dabce05a288
Also disabled the gsdx AF options for the OGL renderer (because it's not implemented for that yet).
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5881 96395faa-99c1-11dd-bbfe-3dabce05a288
Fixes a small oversight on my part. Which was setting the maximum anisotropy to (0,1,2,3,4) respectively, instead of (1,2,4,8,16). So when you selected x16 for example, you were actually only getting x4 >.>
Corrected now, and you will be getting the full x16 when selected :)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5879 96395faa-99c1-11dd-bbfe-3dabce05a288
Adds anisotropic texture filtering (1x-16x) to the hardware settings. Enhances the visual quality of textures that are at oblique viewing angles.
Anisotropic filtering is automatically disabled if: 8-bit textures are enabled.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5878 96395faa-99c1-11dd-bbfe-3dabce05a288
As before: use the hotkey(f7) to switch between them. It will now save the chosen scanline effect, instead of always defaulting to disabled, on load.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5874 96395faa-99c1-11dd-bbfe-3dabce05a288
* gui refresh
+ Use some tab to reduce heigth for small screen
+ Add logz option
+ remove broken/experimental keyword. GSdx ogl is not too bad ;)
* autodetect GL_NV_depth_buffer_float
Linux tester you are welcome!
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5862 96395faa-99c1-11dd-bbfe-3dabce05a288
Best setting if you driver support GL_NV_depth_buffer_float => GL_NV_Depth = 1 & logz = 0
Otherwise => GL_NV_Depth = 0 & logz = 1
Explanation of the bug:
Dx z position ranges from 0.0f to 1.0f (FS ranges 0.0f to 1.0f)
GL z Position ranges from -1.0f to 1.0f (FS ranges 0.0f to 1.0f)
Why it sucks:
GS small depth value will be "mapped" to -1.0f. In others all small values will be 1.0f! Terrible lost
of accuraccy.
The GL_NV_depth_buffer_float extension allow to set the near plane as -1.0f.
So
"GL z Position ranges from -1.0f to 1.0f (FS ranges 0.0f to 1.0f)"
will become
"GL z Position ranges from -1.0f to 1.0f (FS ranges -1.0f to 1.0f)"
and therefore
"z posision [0.0f;1.0f] will map to FS [0.0f;1.0f]" as DX
Yes we just get back all precision lost previously :)
However you need hardware (intel?) and driver support (free driver?/gles?) :(
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5860 96395faa-99c1-11dd-bbfe-3dabce05a288
* use same path as game index db for cheats and cheats_ws
* install the new cheat zip file on cmake and debian installer
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5850 96395faa-99c1-11dd-bbfe-3dabce05a288
This fixes the crashes on F9 for me. We don't want to call GSgetTitleInfo2() when GSOpen() isn't finished yet.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5834 96395faa-99c1-11dd-bbfe-3dabce05a288
HTB123 on our forums came up with a hack that removes the broken main character shadow in hardware rendering
in SMT Nocturne.
This should work for all GSdx recognized versions / regions of the game (tested EU and NTSC-U).
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5819 96395faa-99c1-11dd-bbfe-3dabce05a288
* restore the old fxaa (Asmodeam will be integrated when I got time)
* port the recently added new scanline algo
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5818 96395faa-99c1-11dd-bbfe-3dabce05a288
Add another scanline algorithm, made by pseudonym. This one is pretty fancy, using a cosinus function for generating the dark lines.
It only works on unscaled output, so use software rendering or native resolution for hardware.
It's an effect that works best on 240p converted games (like the SNES titles in Mega Man Collection).
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5812 96395faa-99c1-11dd-bbfe-3dabce05a288
Finish up my scanlines attempt. Now the 2 old shaders are back in and all 3 cycle on F7.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5809 96395faa-99c1-11dd-bbfe-3dabce05a288
Slight adjustments to positions in the GUI also (OCD'd the spacing a little :P)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5796 96395faa-99c1-11dd-bbfe-3dabce05a288
- FXAA shader replaced by Asmodeans shader suite 'PCSX2 Fx 2.00 Revised'.
This is an entire enhancement suite with many configurable options like FXAA, texture sharpening
or lighting tweaks. Some of the effects are pretty advanced so kudos to him allowing it to be integrated with GSdx! :)
Note that there's no interface for the tweaks yet. Until those are done, the defaults will be used.
Release thread here: http://forums.pcsx2.net/Thread-Custom-Shaders-for-GSdx?pid=334766#pid334766
- Disabled an option from the widescreen patches for SH:Shattered Memories.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5793 96395faa-99c1-11dd-bbfe-3dabce05a288
* allow to control the gl dectection from the gui (+1 for the user-friendliness ;) )
* disable geometry shader by default on Intel driver
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5783 96395faa-99c1-11dd-bbfe-3dabce05a288
Added a check box and config code for the fxaa shader.
It's not currently hooked up since the shader setup might get replaced in a day or two.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5777 96395faa-99c1-11dd-bbfe-3dabce05a288
Added a simple scanlines filter (2nd one activated by F7). No tweaking done yet.
It shows how to get scanlines that aren't affected by input resolution or scaling.
A good next step would be determining the exact number of dark lines to show :)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5770 96395faa-99c1-11dd-bbfe-3dabce05a288
Removed some missing headers from the vs2010 and vs2012 project files that were causing vs to always claim the projects were out of date.
Also removed some other entries for c/cpp files that were disabled but also missing (I did not search exhaustively).
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5763 96395faa-99c1-11dd-bbfe-3dabce05a288
(Did this through the google code file editing so if it's broken i apologise :P )
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5762 96395faa-99c1-11dd-bbfe-3dabce05a288
* try to use more subroutine on VS&PS, unfortunately hit a driver crash!
* Call Attach/DetachContext through GSDevice so I can unmap currently mapped buffer
* Implement glsl part of GL_ARB_bindless texture, again hit another driver crash!
* various fix of GL_ARB_buffer_storage. Basic benchmark show only improvement on 'cold' case, I guess it will improve smoothness
* try to fix GL_clear_texture, no success so far. It seem the extension is limited to basic texture (aka no depth/stencil)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5752 96395faa-99c1-11dd-bbfe-3dabce05a288
Performance note, it might be faster to replace the MODULO with an AND. Not sure on the impact
for the new time stretcher algo.
GSdx: fixed use-after-free (linux)
PCSX2:
* add a define to support address sanitizer (both rely on 0x20000000-0x3fffffff memory ranges..)
* sio_buffer out of bond (-1). Maybe we can move the flush in the 2 if previous branch. It would
avoid the extra test.
* wxGetEnv (linux) generates double free (maybe not thread safe). Cache the result so it doesn't crash
anymore when switching renderer
Comment/Review/Improvement are welcome :)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5727 96395faa-99c1-11dd-bbfe-3dabce05a288
* GL_ARB_shader_subroutine for perf
fix for nvidia => add missing shader declaration. Nvidia got +4fps on colin3 :)
For the moment only 2 PS parameters are supported. Code need to be extended to support others games that often
switch shader program (like xenosaga).
require GL4 class hardware and the option override_GL_ARB_shader_subroutine = 1
Note: strangely on AMD linux it is slower!
* GL_ARB_shader_image_load_store for accuraccy (Date)
Use a signed integer texture and reenable color buffer writing
Current status: Amagami_transparency.gs & P3_battle_shadows.gs are now working on Nvidia with a small perf impact.
Current implementation detail:
1/ setup the standard stencil as before
2/ on remaining pixel, draw once to compute first primitive that will write a fail alpha value.
3/ final draw based on primitive id of step 2
Note: I think we would get a bad behavior if depth test&mask are enabled on step 2/3
Note2: on my limited testcase the perf impact was on CPU. It would be possible to merge step1&2 to nullifying it (could
even be faster actually), however it would require more GPU power.
Again require GL4 class hardware. And the option UserHacks_DateGL4 = 1
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5725 96395faa-99c1-11dd-bbfe-3dabce05a288
zzogl: fix memory leak (fix#1431, #1432)
GSdx ogl: disable geometry shader on Nvidia/Windows (I will wait a 3rd implementation to find which one is correct)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5724 96395faa-99c1-11dd-bbfe-3dabce05a288
* redo most of the texture upload (PBO): colin3 benchmark: 32 fps now (vs 26 fps 2 weeks ago)
* use the cross vendor vsync extension on linux (previous wasn't supported by nvidia)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5721 96395faa-99c1-11dd-bbfe-3dabce05a288
* some preliminary work to test/benchmark bindless texture in the future (glsl was not yet updated)
Bindless texture allow to get a GPU texture pointer and then set it directly
to the shader as a basic uniform.
=> no more texture unit selection/validation
=> no more texture validation neither texture hash lookup
3rdparty: update gl header to the latest gl4.4
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5720 96395faa-99c1-11dd-bbfe-3dabce05a288
Card that support gs:
remain only a gs to generate sprite from a line. Even dummy gs are costly for the GPU.
Card that don't support gs:
remove useless copy of color for line and triangle primitives
Note for dx: opengl 3.2 (maybe not gles) supports both flat interpolation
convention (GL_FIRST_VERTEX_CONVENTION or GL_LAST_VERTEX_CONVENTION). It might
be possible to shuffle vertex index to put the last vertex in first position.
- buff[0] = head + 0;
- buff[1] = head + 1;
- buff[2] = head + 2;
+ buff[0] = head + 2;
+ buff[1] = head + 1;
+ buff[2] = head + 0;
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5718 96395faa-99c1-11dd-bbfe-3dabce05a288
The idea was to replace shader program swith by pointer function calls inside
shaders. At least parameters that are often changed between draw call. So far
I only ported atst and colclip. Unfortunately code is "slower" (on GSdx standalone).
For the moment keep the code but disabled.
If I understand well the validation of program is done in the "driver thread"
but the additional call are done in the overloaded MTGS thread. Apitrace
profiling shows faster GPU draw calls. Another possibility is that the driver still
need to validate the draw call because of others state change.
Here some stats on colin3 (90 frames):
without subroutine: UseProgram 125246
with subroutine: UseProgram 2906, subroutine 125945 => 3605 extra calls overhead (not
all parameters are ported to subroutine)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5715 96395faa-99c1-11dd-bbfe-3dabce05a288
Note: I think we can do the same on DX11
Perf wise: on colin mcrae 3 it reduces shader prog setup from 3005 to 2086 each frames. It saves 2 ms of CPU processing (27->29fps)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5714 96395faa-99c1-11dd-bbfe-3dabce05a288
* move most of gl states into a separate namespace. Extend it to depth/stencil/blend micro state
=> save 10,000 opengl call by frame for colin mcrae 3
* Only setup blend state of first drawbuffer
* Don't request anymore a debug context on dev/release build
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5713 96395faa-99c1-11dd-bbfe-3dabce05a288
* clean extension management and fix compilation of previous gl44 code.
* Use pixel buffer object to upload texture data.
=> avoid crash on AMD driver
=> a bit faster and probably got some margins for the future
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5712 96395faa-99c1-11dd-bbfe-3dabce05a288
* preliminary work for GL4.4 extensions (ARB_clear_texture & ARB_multi_bind). Disabled until I got a 4.4 driver
Note: I plan also to use ARB_buffer_storage
* compute texture gl option in the constructor (avoid a couple of swith case)
* redo texture unit management. Unit 0-2 for shaders, Unit 3 for texture operations. MultiBind will allow to bind
shader input without disturbing texture binding points.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5711 96395faa-99c1-11dd-bbfe-3dabce05a288