* some preliminary work to test/benchmark bindless texture in the future (glsl was not yet updated)
Bindless texture allow to get a GPU texture pointer and then set it directly
to the shader as a basic uniform.
=> no more texture unit selection/validation
=> no more texture validation neither texture hash lookup
3rdparty: update gl header to the latest gl4.4
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5720 96395faa-99c1-11dd-bbfe-3dabce05a288
Card that support gs:
remain only a gs to generate sprite from a line. Even dummy gs are costly for the GPU.
Card that don't support gs:
remove useless copy of color for line and triangle primitives
Note for dx: opengl 3.2 (maybe not gles) supports both flat interpolation
convention (GL_FIRST_VERTEX_CONVENTION or GL_LAST_VERTEX_CONVENTION). It might
be possible to shuffle vertex index to put the last vertex in first position.
- buff[0] = head + 0;
- buff[1] = head + 1;
- buff[2] = head + 2;
+ buff[0] = head + 2;
+ buff[1] = head + 1;
+ buff[2] = head + 0;
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5718 96395faa-99c1-11dd-bbfe-3dabce05a288
The idea was to replace shader program swith by pointer function calls inside
shaders. At least parameters that are often changed between draw call. So far
I only ported atst and colclip. Unfortunately code is "slower" (on GSdx standalone).
For the moment keep the code but disabled.
If I understand well the validation of program is done in the "driver thread"
but the additional call are done in the overloaded MTGS thread. Apitrace
profiling shows faster GPU draw calls. Another possibility is that the driver still
need to validate the draw call because of others state change.
Here some stats on colin3 (90 frames):
without subroutine: UseProgram 125246
with subroutine: UseProgram 2906, subroutine 125945 => 3605 extra calls overhead (not
all parameters are ported to subroutine)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5715 96395faa-99c1-11dd-bbfe-3dabce05a288
* preliminary work for GL4.4 extensions (ARB_clear_texture & ARB_multi_bind). Disabled until I got a 4.4 driver
Note: I plan also to use ARB_buffer_storage
* compute texture gl option in the constructor (avoid a couple of swith case)
* redo texture unit management. Unit 0-2 for shaders, Unit 3 for texture operations. MultiBind will allow to bind
shader input without disturbing texture binding points.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5711 96395faa-99c1-11dd-bbfe-3dabce05a288
* add a non-working hack: UserHacks_DateGL4, goal was to replace UserHacks_AlphaStencil
+ Detection of good/bad samples is based on primitive ID variable. However I'm not sure
the behavior is always the same between draw call...Anyway let's keep a copy of the current
work
* Dump integer texture into text csv
* add gl4.2 ARB_shader_image_load_store extension (needed by UserHacks_DateGL4)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5707 96395faa-99c1-11dd-bbfe-3dabce05a288
* use gles header file, disable opengl code (mainly GLX, ARB_sso, geometry shader)
* Define properly the function pointer, GLES use basic linking whereas GL must get the symbol dynamically
* cmake: properly search and set libglesv2.so
* don't use dual source blending => HW renderer work (only miss unimportant FBA)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5701 96395faa-99c1-11dd-bbfe-3dabce05a288
* uniform was wrongly set before the activation of the program (free driver only)
* Always use only 1 drawbuffer. Easier besides previous setup was wrong for convert:ps_main1
GLES trial (v3):
* add the cmake option GLES_API. Note library (libgles) are hardcoded for the moment
* Disable opengl check
* Disable gl_GetDebugMessageLogARB not supported!
* Emulate gl_DrawElementsBaseVertex, add manually the index offset (surely slow but work)
* Fix hundred of shader error (no implicit cast of integer to float...)
Unfortunately GLES doesn't support dual blending so no blend in hardware renderer. Otherwise it is fine
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5700 96395faa-99c1-11dd-bbfe-3dabce05a288
* replace vertex interface with block interface. It avoid to depends on the ARB_sso extension.
* disable geometry shader on Nvidia & Linux. Slower but better than a black screen !
* default logz to 1, avoid some glitches.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5699 96395faa-99c1-11dd-bbfe-3dabce05a288
* redo glsl2h.pl script to generate only one big glsl headers.
* fix gcc warning in GSVector.h
* fix memory leak of GSDeviceOGL.m_shader
* clean shader compilation function => split generation header & drop malloc stuff
cmake:
* only rebuild shader when asked by the use. Avoid perl dependency to build pcsx2
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5698 96395faa-99c1-11dd-bbfe-3dabce05a288
* try to setup advance gl context attribute. If driver doesn't support it fallback to the default.
zzogl:
* use glsl2h to generate an header shader as GSdx. Much easier to install
* Get GSUniformBufferOGL & GSVertexArrayOGL speed improvement from GSdx
cmake:
* detect current gcc version. Yield a warning if < 4.7 or an error if < 4.5
glsl2h:
* support zzogl too
* compute md5sum to avoid useless relink
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5696 96395faa-99c1-11dd-bbfe-3dabce05a288
* fix shader compilation on Nvidia (broken on r5682)
* fix various memory leak thanks to Valgrind
Cmake: fix a minor typo
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5688 96395faa-99c1-11dd-bbfe-3dabce05a288
* Separate state and shader compilation into separate function
* replace various hash_map by basic array
* Compact VertexScale and offset into a single vec4
* add the new option "ogl_vertex_subdata": subdata is faster on FGLRX, test are welcome on Nvidia drivers
0 => use map/unmap
1 => use subdata
replay: add "linux_replay" option and compute some nice stat (mean, standard deviation)
cmake: recreate shader header at build time
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5682 96395faa-99c1-11dd-bbfe-3dabce05a288
* Don't write color during stencil. Keep the old method to ease debug
* Realign date shader on DX
* keep stencil ref to 1. Reduce stencil management burder
* Fix texture pitch (was in pixels but need bytes). Fix Bleach Blade Battlers 2. Thanks Miseru for your trace
Remember note:
fxaa -> not yet implemented
msaa -> not implemented (I might drop it actually)
8 bit texture -> not yet implemented
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5657 96395faa-99c1-11dd-bbfe-3dabce05a288
* Fix shader alpha issue. Hoppefully fix lots of glitches.
* Use glBufferSubData to upload data into the constant buffer instead of map/unmap. More fps :)
* Fix wrong api setup on EGL
* Be more portable on glsl2h
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5650 96395faa-99c1-11dd-bbfe-3dabce05a288
* clean texture management & use different texture unit for various texture operation
* separate sampler and texture setup
* Always disable GL_SCISSOR_TEST before any cleaning to be sure the all pixels are reset
* properly set upscale_multiplier in the linux gui (was mixed with msaa). Unfortunately it is still broken...
* Fix EGL with GSopen1
* Use stdcall for function definition in the replay (avoid segmentation fault)
* Directly use gl_Position in shader instead of an extra user defined variable
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5648 96395faa-99c1-11dd-bbfe-3dabce05a288
* Emulate Geometry Shader from the CPU.
* add some option to override opengl extension detection
* redo shader interface (again) to compile on the free driver
SW renderer is now working on the free driver.
To test it on your linux box use this cmake option -DEGL_API=TRUE
Note: (need opengl 3.0) I test mesa 9.2 git
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5646 96395faa-99c1-11dd-bbfe-3dabce05a288
* fix a bad interaction when GL_ARB_SSO is supported without GL_ARB_shading_language_420pack
* try to replace struct with flat parameter in glsl interface
=> with some hacks of the free driver, I was able to compile SW renderer shader. Unfortunately
the rendering is broken. Maybe my hack :p
* remove EGL context check (wasn't working as expected)
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5644 96395faa-99c1-11dd-bbfe-3dabce05a288
* fix wrong interaction when both GL41 or GL42 aren't supported
* Add a new define to downgrade opengl requirement to test the free driver and EGL
=> As far as I can tell, they only miss geometry shader& GLSL150 (got GLSL140)
* Found a way to avoid crash in AMD driver. Skip the upload of small texture but rendering is wrong now...
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5643 96395faa-99c1-11dd-bbfe-3dabce05a288
* remove an old&useles dummy geometry shader (was used to workaround amd bug)
* use a basic X11 window for GSopen1
* redo the replayer: it is now based on dlopen rather than standard static/dynamic library. AMD driver doesn't play nicely with the later...
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5639 96395faa-99c1-11dd-bbfe-3dabce05a288
* replace both DISABLE_GL41_SSO and DISABLE_GL42 macro with a dynamic check based on the extension support
* glsl2h: use static instead of extern
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl-wnd@5636 96395faa-99c1-11dd-bbfe-3dabce05a288
* add a perl script to convert shader to char*
* By default use *.glsl file (handy to do some trials).
Only drawback, glsl2h need to be manually called at the moment
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5632 96395faa-99c1-11dd-bbfe-3dabce05a288
* Use bigger index for Uniform buffer to avoid any collision with sampler.
* add a new config to disable openGL 4.2 requirement. Would be done at runtime later.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5620 96395faa-99c1-11dd-bbfe-3dabce05a288
* port KrossX patch from r5556 to openGL
* add a basic gui entry, would love an additional description
* also add the pointsampler hack but don't activate it yet
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5570 96395faa-99c1-11dd-bbfe-3dabce05a288
* Don't try to allocate a "null" array when shader log is empty
* glsl: hopefully implicit cast error on Nvidia driver
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5542 96395faa-99c1-11dd-bbfe-3dabce05a288
* add some dummy shader. Can be modify inside the debugger apitrace
* glclear* commands` seem to depend on scissor test and depth mask. Allow full write for the depth buffer
* texture debug, try to output some nice colors for the depth buffers
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5365 96395faa-99c1-11dd-bbfe-3dabce05a288
This may make gsdx slightly slower for everyone (I don't know an easy way to restrict this to affected systems), especially if using 8-bit textures.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5341 96395faa-99c1-11dd-bbfe-3dabce05a288
So, in the end I only properly understood the old code after finding all the problems with my version. I'm not sure whether any changes I've made are improvements any more, I'll need to review it with what I've learned in mind. This effort might've been a big waste of time.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5329 96395faa-99c1-11dd-bbfe-3dabce05a288
gsdx:
* add some parenthesis to shup up very verbose gcc warning
* adapt ogl to latest sudonim change
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5290 96395faa-99c1-11dd-bbfe-3dabce05a288
* properly delete program and vertex array. Avoid a crash on plugin reload
* reset shader state. Avoid to reuse invalid data on plugin reload
gsdx:
* add an hack to unattach/attach the gl context from different thread. Help to solve some crashes. The best will be to move gpu operation out of gsreadfifo but it would need more works
* implement logz for test purpose (don't seem to help)
gsdx replay:
* use default xdg location
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5289 96395faa-99c1-11dd-bbfe-3dabce05a288
Explanation, because this gives me a headache and this might save someone else one (or I might be wrong and they might see why): in D3D10, 0.0 points to the centre of the leftmost texel and 1.0 points one texel to the right of the rightmost texel, so to map a UNORM uniformly across a texel we need to multiply the input by (w-1)/w. In D3D9 0.0 points to the left edge of the leftmost texel and 1.0 to the right edge of the rightmost texel so after the multiplication we add 1/2w.
Actual texture sampling is probably not right for at least one of D3D9 and D3D10, but this headache is killing me.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5279 96395faa-99c1-11dd-bbfe-3dabce05a288
Another refinement to the Wild Arms hack by KrossX.
The hack now only applies to one kind of geometry (sent using the unpacked UV handler).
This works nicer in Wild Arms as it fixes "jumpy" characters.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5124 96395faa-99c1-11dd-bbfe-3dabce05a288
Adding KrossX's Wild Arms text alignment hack to the new dialog box. This hack is actually very interesting for a number of games. It should work well in cases where game designers adjusted everything pixel perfect for the GS, that usually breaks with upscaling.
It should be generalized and renamed later.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5120 96395faa-99c1-11dd-bbfe-3dabce05a288
Committing a hack KrossX prepared (thanks) ;)
It can be used to fix bad character sprites in Gust games.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@5101 96395faa-99c1-11dd-bbfe-3dabce05a288
* Use the new map interface/separate texture coordinate inside shader
* support new format on texture
Note: it is quite instable with various crashes and GL error but at least it compiles now :p
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5094 96395faa-99c1-11dd-bbfe-3dabce05a288
* implement some missing shader for DATE, invert coordinate like strech rectangle
* Use glCopyImageSubDataNV nvidia extension to copy image (you need latest AMD drivers)
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5086 96395faa-99c1-11dd-bbfe-3dabce05a288
* Fix some issue on wnd management, mix between sdl/ogl render
* create new gsdx option for ogl debug
+ debug_ogl_dump: start frame to dump when not 0
+ debug_ogl_dump_length: length of the dump
+ debug_ogl_shader: print shader debug
Note current dump option must be fixed to use linux path.
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5067 96395faa-99c1-11dd-bbfe-3dabce05a288
* invert the index of fragment output. Seem to work better on Nvidia (strangely no impact for AMD)
* opengl support pitch too, so remove useless copy to an fbo
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5034 96395faa-99c1-11dd-bbfe-3dabce05a288
* Fix the Geomtry shader to output 2 triangles for quad primitive (ie 2R rendering)
- There is an AMD driver bug on geomtry shader input interface (well could be the spec too). Tell me if it still working on nvidia
* Add a workaroung to a previous AMD bug. It is impossible to unattach a shader so destroy the full shader pipeline...
* Be more strict on FBO management. Would optimize it later
* use a texture insted of a render buffer for depth-stencil management.
* add more dumping capabilities (in particular depth buffer)
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5033 96395faa-99c1-11dd-bbfe-3dabce05a288
* It works with the HW renderer !!!
* Not sure this fix will work with dual blending but we will see later
* screen is vertically flipped again...
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5031 96395faa-99c1-11dd-bbfe-3dabce05a288
* fix vertex shader for HW renderer :) Remains the fragment part...
* add some dumping infrastucture (DUMP_START and DUMP_LENGTH)
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5030 96395faa-99c1-11dd-bbfe-3dabce05a288
* go back to opengl 4.1 (nvidia driver is buggy with 4.1).
* fix the backbuffer allocation. bad order of parameter
* fix remaining glsl error
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5006 96395faa-99c1-11dd-bbfe-3dabce05a288
* rework a little the shader to be hopefully compatible with nvidia
* request a pure 4.2 context (all gpu 4.1 capable support 4.2 anyway)
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@5003 96395faa-99c1-11dd-bbfe-3dabce05a288
* Enable interlace feature. (note I'm well aware that interlace crashes with SDL)
* remove useless callback debug function
* handle the y axis differently. Move vertex to follow right-hand system (render everything in reverse) then flip the y-axis for the screen rendering
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@4987 96395faa-99c1-11dd-bbfe-3dabce05a288
* add a callback for GLERROR. Allow to breakpoint on GSDeviceOGL::DebugCallback (gdb is completely lost on amd driver, hope it is better on nvidia)
* Add some empty glsl convert to shutup some useless debugging error
* request an advance opengl context without pre 3.0 feature.
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@4983 96395faa-99c1-11dd-bbfe-3dabce05a288
* flip y-axis in merge stage
* default to xlib window managment (the dynamic switch between sdl and xlib crashi but SDL will probably dropped later)
* improve management of FBO, draw buffer
* try to fix some issue with glClearBuffer but spec is not clear.
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@4982 96395faa-99c1-11dd-bbfe-3dabce05a288
* implement the saving of texture (take bmp SW code)
* fix the missing "enable attribute code" and the typo in glsl. It works now !!!
* rework a little texture to pack texture into a temporay buffer when src pitch != dst pitch
* try to replace sdl with pure xlib (not yet enabled by default but it seems to work)
Note there still a minor issue, coordinate are different between DX and OGL (upper-left vs lower-left) so the image is inversed.
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@4981 96395faa-99c1-11dd-bbfe-3dabce05a288
* implement shader code (subset that seem to be used in SW mode)
* some hack to fix some alignment issue with GSVertexPT1 structure
* Lots of various opengl fixes. Remaining unsupported GL feature are inside SDL.
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@4974 96395faa-99c1-11dd-bbfe-3dabce05a288
* make a rough implementation of most of the opengl device interface. Only a savestate nothing to expect yet.
* depend of libglew 1.6 (normally 1.7 but I manually defined the only missing function)
git-svn-id: http://pcsx2.googlecode.com/svn/branches/gsdx-ogl@4971 96395faa-99c1-11dd-bbfe-3dabce05a288
Skygunner crashing on boot.
James Bond 007: Everything or Nothing doing a huge Vram usage when opening the weapons screen and making the system crawl at it. Couldn't test much with this one and only added the US version for now.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@4921 96395faa-99c1-11dd-bbfe-3dabce05a288
32-bit depth buffers for D3D9 users if available. Lots of code shuffling for reasons I don't even remember. Stuff. Pretty much just the 32-bit depth buffers. That's good though, you don't have to envy D3D10 users half as much now.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@3002 96395faa-99c1-11dd-bbfe-3dabce05a288
GSDx: Removed discards from partial colclamp support as it wasn't doing much good and definitely won't be necessary with the next stage of support. No significant functional change probably.
As before, please do a full rebuild of gsdx. I hate it as much as you but don't know how to make VS smarter about this.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@2712 96395faa-99c1-11dd-bbfe-3dabce05a288
Fixes shadows in Ico and Shadow of the Colossus and hopefully fixes more effects in other games.
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@2702 96395faa-99c1-11dd-bbfe-3dabce05a288
- Automatic texture filtering should be ok now, occasionally point filtering was used. Tested it on the ps2 and figured with no mip levels LoD and minification settings are just ignored altogether.
- Also run a few tests on the gather instruction with the reference rasterizer and found a fatal flaw with it. It returns the four samples for bilinear sampling (in a funny order, which isn't documented of course, x = bl, y = br, z = tr, w = tl), but there is no way to guess which four were selected exactly. Due to some hidden rounding error it might grab different texels than I would when calculating the position of the upper-left texel, of which the fractional part is be used for the interpolation. When the texel positions do not match it leaves annoying discontinuity errors. Oh well...
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@1571 96395faa-99c1-11dd-bbfe-3dabce05a288
- trying the dx10.1-only "gather" shader instruction for palletized lookups ("8-bit texture" mode), saves 4 instructions which isn't much but still... (not tested, don't have ati)
- may fix the intel gma "no output" bug (don't have gma either :P)
- and the usual small code optimizations
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@1549 96395faa-99c1-11dd-bbfe-3dabce05a288
And there is even an explanation.
The tfx functions calculate At * Af >> 7, which means modulating by 0x80 should return At as the result.
With the evil floating point pixel shader however 0x80 translates to 128/255 (0.502), not exactly 0.5, modulation as At' * Af' * 2 (' means 0 - 1.0 range) is not the same as with integers.
At' = Af' = 0.502
At' * Af' * 2 = 0.504
If the alpha test happens to be "not equal to 0x80", then abs(0.504 - 128/255) < 0.5/255 will just miss.
Solution is to re-scale those values to the integer range, do the calculations, and then back to float again, but in the end it just simplifies down to At' * Af' * 255/128, doh...
At * Af >> 7 => ((At' * 255) * (Af' * 255) / 128) / 255 => At' * Af' * 255/128
At' = Af' = 0.502
0.502 * 0.502 * 255/128 = 0.502 (w00t!)
git-svn-id: http://pcsx2.googlecode.com/svn/trunk@1272 96395faa-99c1-11dd-bbfe-3dabce05a288