Accurate options do a better jobs. Technically it can still
be useful for old gpu/driver that doesn't support the GL4.5 extension.
On Windows, you can still rely on Dx
On linux, free driver support it (except Intel)
Code was completey bitrotten
Code was a partial test (and yet 500 lines already)
Shader is more and more complex and multithreading support greatly
reduce the cost of shader switch
Nvidia allows to get the ASM of the shader of the compiled shader. It is useful
to check the performance.
It also allow me to compile most of shader code path for QA
Dump is enabled in linux replayer + debug_glsl_shader = 2
It might save a couple of fps
Add a define to test the perf if we keep only the blue channel. It brokes
the code in Prince Of Persia that use the Red/Green channel... Maybe the
speed hack :( Or find a way to replace all if with a lookup table
Note: it is only supported on OpenGL currently
It might help to fix a bit the color on a couple of games
accurate_fbmask = 1
Code uses GL4.5 extensions. So far it seems the effect is ony used a couple
of time and often in non-overlapping primitive. Speed impact will likely remain small
GS doesn't supports texture shuffle/swizzle so it is emulated in a
complex way.
The idea is to read/write the 32 bits color format as a 16 bit format.
This way, RG (16 lsb bits) or BA (16 msb bits) can be read or written with
square texture that targets pixels 1-8 or pixels 8-16.
However shuffle is limited. For example you can copy the green channel
to either the alpha channel or another green channel.
Note: Partial masking of channel is not yet implemented
V2: improve logging
V3: better support of green channel in shader
V4: improve detection of destination (issue due to rounding)
The purpose of the code is to support alpha channel
of RT uses as an index for a palette texure.
I'm afraid that code will likely break pure palette texture. Only used
if paltex is enabled
It fixes missing shadow in Star Ocean 3 (issue #374) in Native resolution
with filter = 0 (no filtering) or = 2 (normal fitering)
Rendering explanation:
The game emulates a stencil buffer with the alpha channel
The alpha channel of the RT can contains a palette texture index (format 4HH)
The idea is to have a gradient of value in the palette (16/32/48/...).
This way you can implement a +16/-16 and even wrap the alpha value every time
you hit the pixel.
Bilinear filtering breaks the rendering because it interpolates between counts
so you doesn't have the exact count
Upscaling breaks the rendering because the RT is reused as an input texture. It means
that we need to scale it down which again create some interpolations.
This way it will allow to implement all blendings operartion in FS.
Of course it will be slow, but it would be nice for debug and quickly check
game error rendering.
Currently we're trying to infer the conversion shader based on the output format
It only works if the input data is RGBA8. It might not be true in the future
Initially we copy pitch by line in the PBO and tell the dma to only
use the first valid byte.
Now, we only copy useful data to the PBO. It reduce the copy and PBO memory requirement.
It seems a bit faster on native resolution