Older Qualcomm drivers rotated the framebuffer 90 degrees and this fix didn't work.
Now for some obscene reason it rotates a full 180 degrees.
This can at least be worked around by flipping around the image on our end.
On Windows, nvidia don't give us their driver version, so we can't workaround any issues.
As buffer_storage is broken on some drivers, we wanted to disble it for them.
So we can't.
Luckyly only "some" released driver versions are affected as this extension is only available since some months. Let's hope that nobody have to use one of this driver version, else they will get a black screen ...
OSX has their own driver, so performance issues aren't shared with the nvidia driver (unlike the closed source linux and windows nvidia driver). So now they'll also use the MapAndSync backend like all other osx drivers.
fixes issue 6596
I've also cleaned up the if/else block selecting the best backend a bit.
The only two devices that do this are Mesa software rasterizer and Intel Ironlake(With a few hacks).
Basically since it doesn't support OpenGL 3.0, it can't grab the version the new way.
So failing that, it sets to GL 2.1, and continues.
Further along, on Ironlake at least, it tries grabbing the extensions the new GL 3.0 way and fails.
So have a fallback that grabs the extensions string the old way, in probably the most elegant way possible.
The old way was to use big switch/case statements based on a type of buffer.
The new one is to use inheritance.
This change prohibits us to change the buffer type while running, but I doubt we'll ever do so.
Performance should also be a bit better. Also a nice cleanup.
Added some comments about this different kind of buffers.
This is a bit slower on map_and_* because of flushing and _very_ much slower on buffer(sub)?data because of a new memcpy.
But this design allow us to decode directly into a gpu buffer, eg vertexloader will profit :)
gl.h and glext.h provide most of the function pointer typedefs and defines for extensions and core features.
The only one it doesn't provide is GL 1.1 function typedefs, but this is to be expected.
If anything needs defines or typedefs in their header in the future, that's as easy as before.
This one was introduced to reduce the glBindTexture and glActiveTexture calls. But it was quite a bit of logic and only an improvment on uploading/creating a texture, which is done rarely.
This branch is the final step of fully supporting both OpenGL and OpenGL ES in the same binary.
This of course only applies to EGL and won't work for GLX/AGL/WGL since they don't really support GL ES.
The changes here actually aren't too terrible, basically change every #ifdef USE_GLES to a runtime check.
This adds a DetectMode() function to the EGL context backend.
EGL will iterate through each of the configs and check for GL, GLES3_KHR, and GLES2 bits
After that it'll change the mode from _DETECT to whichever one is the best supported.
After that point we'll just create a context with the mode that was detected
We are used to render them out of order as long as everything else matches, but rendering order does matter, so we have to flush on primitive switch. This commit implements this flush.
Also as we flush on primitive switch, we don't have to create three different index buffers. All indices are now stored in one buffer.
This will slow down games which switch often primitive types (eg ztp), but it should be more accurate.
add the GL include (back) to Base.props
use a similar technique to GLX.cpp (by Sonic) in WGL.cpp to get
wglSwapIntervalEXT without the WGLEW check
Conflicts:
Source/Core/VideoBackends/OGL/OGL.vcxproj
Source/Core/VideoBackends/OGL/OGL.vcxproj.filters
Source/VSProps/Base.props
This "u32 components" is a list of flags which attributes of the vertex loader are present.
We are used to append this variable to lots of vertex generation functions, but some of them don't need it at all.
The usual way to handle this kind of request is to rise a flag which the gpu thread polls.
The gpu thread itself either generates the result or just write zeros if disabled.
After this, it rise another flag which says that this work is done.
So if disabled, we still have the cpu-gpu round trip time. This commit just returns 0 on the cpu thread
instead of playing ping pong...
fixes issue 6898
OpenGL defaults are GL_REPEAT, which is even more unlikely than GL_CLAMP_TO_EDGE.
As I can't test the behavoir of the real hardware, I changed it to how it works before,
but I guess just clip the texture makes more sense.
Real xfb didn't provide any read_stride, so there is a division by zero.
This commit calculates the correct read_stride for real_xfb, so there is also no hack for texture vs xfb needed.
This is done with a pixel buffer object. We still have to stall the GPU, but
we only do it once per efb2ram call.
As the cpu can't access the vram, it has to queue a memcpy for the gpu and
wait for the gpu to finish this copy. We did this for every cache line which
is just stupid. Now we copy the complete texture into a pbo and readback this
at once. So we don't have to wait for lots of round-trip-times.
Also use attributeless rendering. But we need the src rect, so set it by uniform.
If there is a slowdown here (I doubt as the driver likely has a fast path to update uniforms)
then we should check if this rect changes and only then update the uniform.
We use attributeless rendering, so officially we have to bind _any_ VAO.
As the state of this VAO doesn't matter, we don't have to switch it.
Also fix an AMD issue as they don't like to render from an empty VAO.
We neither scale nor render from subimages, so we by using gl_Position, we don't have to generate _any_ vertices for this converting.
Also remove the glTexSubImage optimization as every driver does it when needed. But there are some workflows (eg on APU) where it's better to realloc this texture instead of a second memcpy or stall.
This fixes Real XFB Jaggies in OpenGL on games which use yscaling, such
as most PAL games.
This fixes the last of the "Real XFB Macroblocking" issues for opengl,
see issue #6503
Seems OpenGL ES 3 Requires you must have an lod argument, while Desktop
versions require you must not have a lod argument if you are using a
Sampler2DRect (which doesn't do Mipmapping).