ogl_texture_storage 1 creates texture corruption.
Advance date is too slow, code need to be updated (properly) to uses 2 passes only not 3
Maybe one could be enough (sometimes)
DATE is implemented in 2 ways.
1/ with stencil
2/ purely in FS (sw)
I kept method 1 to reduce the work on method 2. It sucks for performance.
So it would be either 1 or 2.
Note: DATE has a big impact on higher upscaling
Note2: you can disable the 2nd method with this configuration parameter
override_GL_ARB_shader_image_load_store = 0
Note yet enabled because I'm afraid of data corruption but feel free to test it
The option:
ogl_vertex_storage = 1
Performance note (warm cache+gs replay on colin3)
60 fps -> 76 fps
UserHacks_UnscaleSprite = 1 will unscale flat sprites
UserHacks_UnscaleSprite = 2 will unscale all sprites (don't work well so far)
The idea of the hack is to redo the interpolation of texture coordinate
based on the non-upscaled pixel position.
It avoids various glitches but sprites aren't upscaled anymore (so no
more anti-aliasing, potentially a coefficient can be added).
* separate VS/GS and FS
* separate subroutine part of the FS
It already complex enough without subroutine stuff. Besides I'm not sure
we will keep subroutine on the future.
If someone has a more elegant solution, feel free to share it
spin_thread = 0
spin_thread = 1 // the faster but GS thread will never stop, very bad for laptop
It is faster on linux, it requires less code, and it is "portable"
It requires boost (only hpp files) + MSVC 2013 (for atomic) (seem doable by 2012 too)
Actually there are several queues that either use spinlock or full sleep
* Use tooltip for hack
* Update string of previous hack
* Remove unused hack
Note: hack_sprite_check requires 3 states (and potentially others hack too) but
I don't know how to do it. It likely requires a "scale button"
It works as bad as a "clever" implementation.
It seems to be enough for games such as venus/taisho-monoke/FFX
Note: it might creates glitches. Code will never be nice, so it is just
a trade-off
-1 is annoying because minimum value is 0. Instead to add more logic,
let's try to use 0 which seems to be good enough (fix regressions on DQ8/AT)
Unfortunately it causes a mini regression on taisho-mononoke. Rotation of sprite is done with 2 triangles.
Potentially previous value wrongly recover this section.
Tekken is broken with setting UserHacks_round_sprite_offset = 2 use 1 instead.
=> game use upscaling internaly so my rounding of coordinate break everything
Yakuza uses float coordinate but this hack only correct the integer coordinate
=> the solution will be to add a dedicated hack for this game.
Colin3 got a regression but I don't know when...
PS2 uses a -0.5 offset before sampling so texels must be rounded to half-pixel boundary
If fixes glitches on Venus and taisho-mononoke
Note: I didn't fixed yet texture render in reverse because I don't have any test for it.
UserHacks_round_sprite_offset = 1 <= enable correction of flat sprites
UserHacks_round_sprite_offset = 2 <= enable correction of all sprites (better on a couple of dump but not sure of the consequence)
I completely redo the algorithm. This time I do the projection and
interpolation of the 2 extrem vertex. This way I can compute the min/max
valid texture coordinate.
It gives stronger guarantee that texture sampling will be done inside the texture.
However it might have a performance impact, likely reasonable because it
is limited to sprite vertex.
A big thanks you to all people that provide me GS dump and test reports.
It is replacement of the previous hack (UserHacks_stretch_sprite). Don't enable both in the same time!
The idea of the hack is to move the sprite to the pixel boundary. It
avoids most of rounding issue. It also rescales verticaly the sprite (avoid horizontal line on ace combat).
I don't like this rescaling maybe we can limit it to only 1 pixels.
On my limited testcase, results are much better with any upscaling factor.
I still have a bad line in Kingdom heart. If you have issue with others
game please provide us a GS dump.
2x upscaling is pixel perfects. Bigger upscaling is better but not yet perfect
Feedbacks are welcomes (note it doesn't solve all upscaling issue, only wrong texture sampling)
For the history:
If you have a texture of [0;16[ texels and draws a primitive [0;16[
The formulae to sample last pixels of texture is
0.5 + (16*s-1)/(16*s) * 16
Native (s==1): 15.5 (good)
2x (s==2): 16 (bad, outside of the texture)
4x (s==4): 16.25 (bad, really outside of the texure))
+ This is not yet perfect. Really, this standard seems like a load of crap to me in fact...
At least it works now. Should test again when gcc 5 & new c++ libs gets out.. Until then, it will do.
On Linux, CPUs with AVX2 instruction sets that have TSX disabled (by
microcode update or otherwise) fail to build GSdx. The __RTM__ macro is
undefined, with leads to the TSX RTM instruction set (_xbegin, _xend,
_xabort, etc.) being unavailable.
Modify the preprocessor check so that the RTM instructions are only used
if available.
It conflicts with the global definition
I don't remember why this option was set on GSdx. Potentially it could be dropped (or fixed correctly)
Anyway, it will help to enable Address Sanitizer on Linux Build
+ Implement threads/mutex using std::thread/std::mutex/std::condition_variable
+ Old implementation can be used by commenting out the "define _STD_THREAD_ in GSThread.h
+ Commit should be much cleaner than previous one.
Note: of course it requires a glsl shader ;)
On windows, you can set the path on the ini file. Here an example with linux path:
shaderfx_conf = /home/gregory/playstation/emulateur/pcsx2_merge/bin/GSdx_FX_Settings.ini
shaderfx_glsl = /home/gregory/playstation/emulateur/pcsx2_merge/bin/shader.fx
I need to check carefully the consequence of ABI change. So far wx is very unhappy!
Fatal Error: Mismatch between the program and library build versions detected.
The library used 2.8 (no debug,Unicode,compiler with C++ ABI 1002,wx containers,compatible with 2.6),
and your program used 2.8 (no debug,Unicode,compiler with C++ ABI 1006,wx containers,compatible with 2.6).
Copy the full line into the pbo. Dma will only take GL_UNPACK_ROW_LENGTH
- increase memcpy size by 2 in the pbo
+ single memcpy will be faster and can use sse
Enable buffer_storage extension:
* GL_CLIENT_STORAGE_BIT was required (it is the duty of TexSubImage to copy data into the GPU mem)
* Enable the extension by default
Null is equivalent to a clear to 0.
Note: Code is not yet used because both stencil and depth are cleared.
Future note: stencil can potentially be replaced by load_store_image
* Don't use the stack
* Don't compare MinMax parameter (depends of others)
* Don't store not-compared parameter in the cache (HalfTexel/MinMax)
+0.3fps/46fps (well better than nothing)
* Only a single VAO
=> Format is set once
=> Only a single bind at startup
=> GSVertexBufferStateOGL is nearly useless
=> barely faster but better than nothing :)
A couple of fallbacks were introduced for the Mesa driver that only support 3.0
DSA will require a recent Mesa which already support GL3.3
Require at least SandyBridge for Intel GPU
* use DebugOutputToFile as a callback of gl error. Add a breakpoint to
find the culprit GL call
* use string instead of char[n]
Note: CheckDebugLog is potentially useless now
* GL_ARB_clip_control: reduce z fighting
* GL_ARB_clear_texture: no real speed gain (but improve code quality)
* GL_ARB_bindless_texture: +1fps (if you're CPU limited)
* require a 3.1 context
* unattach texture of the fbo when they're not used
(avoid to have a texture and depth_stencil with different size)
Note: except minor shader bug it works on Nvidia 340.23.01
This doesn't update the file to the latest version from mingw32 since this is already a custom header stripped from mingw32.
This also fixes the functions for x86_64(verified that it still works for both architectures) and also updates the version inside of GSdx.
Also removes a comment in the GSdx header saying that these functions are broken since they no longer are.
deassert gsopen_done before delete the gs object. Compiler and CPU will
reordonate the store (or not because it is a global variable). Probably not
perfect but better.
Got a crash on GSKeyEvent which I guess wasn't thread safe. So let's ensure
gs is open.
There is no such thing as MGS3 Substance only Subsistence. Substance is MGS2 :P
Also this new CRC is US Disc 2 of Subsistence. The blue stripes should be fixed now I guess.
1/ initReadFifo will be first called on the GS thread
(openGL can only be done on the GS thread)
2/ readFifo will be called on the EE thread. It is not safe too access eeMem from GS
because of memory is virtual
Fix "recent" regression (crash) on Kingdom heart and others game too.
v2: add a len check on GSState::InitReadFIFO
* Fix to create a 3.0 GLES context
* set a default precision to fix most of shader compilation issue
* Crash later because of GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT
=> need to test opensource driver
Remains 3 files that I don't know the source
pcsx2/windows/DwmSetup.cpp: *No copyright* UNKNOWN
pcsx2/windows/SamplProf.cpp: *No copyright* UNKNOWN
pcsx2/windows/VCprojects/IopSif.cpp: *No copyright* UNKNOWN
Remains 1 files in common that I don't know the source
common/include/comptr.h: *No copyright* UNKNOWN
Remains too much files in plugins that I don't know the source :(
Add the --gles build option to the linux main script
Ifdef all gl code not supported on gles3 (note some will be reenabled for gles3.1)
Note: it probably doesn't run anymore. My Nvidia driver doesn't support
yet egl/gles so I can't test it. Feel free to contribute.
A bit slower. Maybe because SubData does the copy in the driver thread. My memcpy is done on
the main thread. I'm not sure it would worth an extra thread to copy vertex data to the GPU
Note: testers are welcome. You need to edit the ini file.
"ogl_vertex_storage=1" <= enable the extension
"ogl_vertex_storage=0" <= disable the extension
Again you need the support of GL_arb_buffer_storage (i.e. not catalyst)
Improve arb_buffer_storage implementation
Try harder to align the texture buffer
Strangely arb_buffer_storage is 3 times slower on my PC (nvidia)
Tester are welcome! Open the ini file
"ogl_texture_storage = 1" <= enable the extension
"ogl_texture_storage = 0" <= disable the extension
Note: you need an opengl 4.4 driver or one that support arb_buffer_storage (i.e. not catalyst)
This hack was used because GSReadFifo was called from the EE thread.
Previous commit move the call to the GSThread.
Hopefully avoid flushing the full GPU contex would improve openGL
performance (at least avoid some hiccups ;) )
Note: newer GSdx ogl won't be compatible with older PCSX2
Crashfix for a weird GIF_FLG_IMAGE2 situation in Wallace and Gromit Project Zoo. Needs further investigation.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5912 96395faa-99c1-11dd-bbfe-3dabce05a288
reduce_gl_requirement_for_free_driver => set it to 1 to only required a 3.0 context (you still required the good extensions)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5910 96395faa-99c1-11dd-bbfe-3dabce05a288
Reformatted fx files that were causing issues on certain text editors. They should now display correctly in those editors.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5897 96395faa-99c1-11dd-bbfe-3dabce05a288
Also removed the fallback recovery ps, and replaced the compile fail catch to a simple console print. Which I think is safer, and faster.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5894 96395faa-99c1-11dd-bbfe-3dabce05a288
UserHacks_AlphaStencil will take precedence on override_GL_ARB_shader_image_load_store
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5891 96395faa-99c1-11dd-bbfe-3dabce05a288
Note: DATE is used for shadows (persona 3) and others effect
You can disable it when you disable gl image extension (override_GL_ARB_shader_image_load_store = 0)
You can also enable it on AMD too (set it to 1 this time) but no guarantee.
If you feel the extension is too slow, you can try disable some gl barrier (aka "damn the torpedo full steam ahead!").
It can be done with the option UserHacks_DateGL4 = 1. Otherwise just disable the extension.
Note: don't enable UserHacks_AlphaStencil in the same time. GL_ARB_shader_image_load_store is an alternative implementation.
Enabling both in the same time will lead to undefined (well surely wrong) behavior.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5890 96395faa-99c1-11dd-bbfe-3dabce05a288
* properly detect gl nv depth extension
* Always show the hack on the gui. Add a new hack option for DATE (gl4.2) only
* Save the scan mode on linux too (f7)
* hopefully fix some crash on some drivers... (ensure aligment 256 bits alignment, and if not use std memcpy)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5888 96395faa-99c1-11dd-bbfe-3dabce05a288
Also disabled the gsdx AF options for the OGL renderer (because it's not implemented for that yet).
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5881 96395faa-99c1-11dd-bbfe-3dabce05a288
Fixes a small oversight on my part. Which was setting the maximum anisotropy to (0,1,2,3,4) respectively, instead of (1,2,4,8,16). So when you selected x16 for example, you were actually only getting x4 >.>
Corrected now, and you will be getting the full x16 when selected :)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5879 96395faa-99c1-11dd-bbfe-3dabce05a288
Adds anisotropic texture filtering (1x-16x) to the hardware settings. Enhances the visual quality of textures that are at oblique viewing angles.
Anisotropic filtering is automatically disabled if: 8-bit textures are enabled.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5878 96395faa-99c1-11dd-bbfe-3dabce05a288
As before: use the hotkey(f7) to switch between them. It will now save the chosen scanline effect, instead of always defaulting to disabled, on load.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5874 96395faa-99c1-11dd-bbfe-3dabce05a288
* gui refresh
+ Use some tab to reduce heigth for small screen
+ Add logz option
+ remove broken/experimental keyword. GSdx ogl is not too bad ;)
* autodetect GL_NV_depth_buffer_float
Linux tester you are welcome!
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5862 96395faa-99c1-11dd-bbfe-3dabce05a288
Best setting if you driver support GL_NV_depth_buffer_float => GL_NV_Depth = 1 & logz = 0
Otherwise => GL_NV_Depth = 0 & logz = 1
Explanation of the bug:
Dx z position ranges from 0.0f to 1.0f (FS ranges 0.0f to 1.0f)
GL z Position ranges from -1.0f to 1.0f (FS ranges 0.0f to 1.0f)
Why it sucks:
GS small depth value will be "mapped" to -1.0f. In others all small values will be 1.0f! Terrible lost
of accuraccy.
The GL_NV_depth_buffer_float extension allow to set the near plane as -1.0f.
So
"GL z Position ranges from -1.0f to 1.0f (FS ranges 0.0f to 1.0f)"
will become
"GL z Position ranges from -1.0f to 1.0f (FS ranges -1.0f to 1.0f)"
and therefore
"z posision [0.0f;1.0f] will map to FS [0.0f;1.0f]" as DX
Yes we just get back all precision lost previously :)
However you need hardware (intel?) and driver support (free driver?/gles?) :(
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5860 96395faa-99c1-11dd-bbfe-3dabce05a288
* use same path as game index db for cheats and cheats_ws
* install the new cheat zip file on cmake and debian installer
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5850 96395faa-99c1-11dd-bbfe-3dabce05a288
This fixes the crashes on F9 for me. We don't want to call GSgetTitleInfo2() when GSOpen() isn't finished yet.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5834 96395faa-99c1-11dd-bbfe-3dabce05a288
HTB123 on our forums came up with a hack that removes the broken main character shadow in hardware rendering
in SMT Nocturne.
This should work for all GSdx recognized versions / regions of the game (tested EU and NTSC-U).
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5819 96395faa-99c1-11dd-bbfe-3dabce05a288
* restore the old fxaa (Asmodeam will be integrated when I got time)
* port the recently added new scanline algo
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5818 96395faa-99c1-11dd-bbfe-3dabce05a288