Approximately three or four times now, the issue of pointers being
in an inconsistent state been an issue in the video backend renderers
with regards to tripping up other developers.
Global (ugh) resources are put into a unique_ptr and will always have a
well-defined state of being - null or not null
This was relying on behaviour that GLExtensions was adding fake extensions to the supported list with ES.
This no longer happens so it needed to be changed.
This was due to specifying negative source coordinates for the texture copy, which must lie within the bounds of the source and destination textures.
The behavior now is to clamp the copy region to [0 <= size <= backbuffer size], resulting in a copy region that can be smaller than the backbuffer, but never larger.
Since ResolveSubresource cannot be used with depth textures (and throws an error with the debug layer enabled), use a shader which selects the minimum depth value from all samples.
Changes the sampler by XFBEncoder to use a linear filter, rather than point, to match GL behavior.
The spec says that vendors can set the max texture size to be 65KB and we want 1MB.
Check the maximum supported and drop to the max if it is less than 1MB
They were only called at once, so no need to seperate them.
This also removes the only dereference of the NativeVertexFormat in VideoCommon, so backends may just return nullptr.
It was only implemented in OpenGL, though the option was visible in both
backends, leading to memory leaks if you enabled it in DirectX.
And it wasn't particularly useful as a debug feature as it only showed
where in the EFB the copies were taken from, not what format it was, or
what the copy was used for, or what content was in the EFB at that point
in time.
Also, it stretched the copy regions relative to the window, so the
on-screen regions don't even line up with the window unless the game used
the full EFB (some pal games) and you game image stretched to the full
window.
This removes OSD support for video software, but it was already broken before.
This commit does not try to fix coding style issues, the rewrite of this presentation API is splitted up.
Gets rid of magic numbers in cases where the array size is known at compile time.
This is also useful for future entries that are stack allocated arrays as these
functions prevent incorrect sizes being provided.
Removed Quality Levels from D3D AA options
Dropdown text now shows whether you're applying MSAA or SSAA
Added a description for SSAA
Moved SSAA checkbox
Cleaned up AA in backends slightly. Supported modes is now a list of ints.
Instead of blindly using the expected width, clamp it to the stride of the
buffer which dx11 returns. This prevents use from reading invalid memory
at the end of textures.
This doesn't solve the base issue of what to do when a game tries to copy
from outside the efb. On real hardware it returns random noise (biased
to all ones)
Instead of having special case code for efb2tex that ignores hashes,
the only diffence between efb2tex and efb2ram now is that efb2tex
writes zeros to the memory instead of actual texture data.
Though keep in mind, all efb2tex copies will have hashes of zero as
their hash.
Addded a few duplicated depth copy texture formats to the enum
in TextureDecoder.h. These texture formats were already implemented
in TextureCacheBase and the ogl/dx11 texture cache implementations.
SSAA relies on MSAA being active to work. We only supports 4x SSAA while in fact you can enable SSAA at any MSAA level.
I even managed to run 64xMSAA + SSAA on my Quadro which made some pretty sleek looking games. They were very cinematic though.
With this, it properly fixes up SSAA and MSAA support in GLES as well. Before they were broken when stereo rendering was enabled.
Now in GLES they can properly support MSAA and also stereo rendering with MSAA enabled(with proper extensions).
OpenGL ES 3.2 adds this feature to core
It was available to GLES 3.1 as GL_{EXT, OES}_texture_buffer as well.
For the non-Nvidia vendors that implemented this is:
- Qualcomm's Adreno 4xx
- IMGTec's PowerVR Rogue
Samsung updated the video drivers on the SGS6 which introduced a bug when disabling vsync.
Both the driver versions are r5p0, but the md5sums of the blob differ.
To work around the issue, make sure to never disable vsync by calling eglSwapInterval.
We can't actually determine the driver version on Android yet.
So until the driver version lands that displays the driver version string in the GL_VERSION string
we will need to keep this workaround enabled at all times, which is a bit annoying.
Current mali drivers return the video driver version in one of the EGL strings you can query.
The issue with that is that Android eats all of those strings, so we can't query it.
OpenGL ES 3.2 adds a few things we care about supporting in core. In particular:
- GL_{ARB,EXT,OES}_draw_elements_base_vertex
- KHR_Debug
- Sample Shading
- GL_{ARB,EXT,OES,NV}_copy_image
- Geometry shaders
- Geometry shader instancing (If they support GL_{EXT,OES}_geometry_point_size)
Nvidia was the first to release an OpenGL ES 3.2 driver which I uesd to test this on.
This also enables GS Instancing on GLES 3.1 hardware if it supports all of the required extensions.
Their new driver that supports GLES3.1 + AEP has issues with it.
At the very least they don't implement all of the geometry shader features fully which causes shader linker issues when we attempt to use them.
I don't have a device so I can't fully test, so until I do I'm going to blanket disable the whole thing.
When calculating the size of the undisplayed margin in the case where
fbWidth != fbStride for RealXFB for displaying in the output window,
we do not scale by IR - RealXFB is implicitly 1x.
The default EGL_RENDERABLE_TYPE is GLES1, so vendors have the ability to choose between returning only the bits requested, or all of the bits
supported in addition to the one requested.
PowerVR chose to take the route where they only return the bits requested, everyone else returns all of the bits supported.
Instead of letting the vendor have control of this, let's incrementally go through each renderable type and make sure it supports everything we want.
This will cover all devices for now, and for the future.
- Refactored Light Attenuation into inline function in Software Renderer
- Corrected zero length light direction vector to resolve with normal direction (essentially becomes LIGHTDIF_NONE which was what I was after)
- Change the API of this shared function to use points for output variables (degasus)
- Fixes remaining lighting issues (Mario Tennis, etc)
- Apply same fixes to Software Renderer
- Corrected zero length light direction vector to resolve with normal direction (essentially becomes LIGHTDIF_NONE which was what I was after)
The new implementation has 3 options:
SyncGpuMaxDistance
SyncGpuMinDistance
SyncGpuOverclock
The MaxDistance controlls how many CPU cycles the CPU is allowed to be in front
of the GPU. Too low values will slow down extremly, too high values are as
unsynchronized and half of the games will crash.
The -MinDistance (negative) set how many cycles the GPU is allowed to be in
front of the CPU. As we are used to emulate an infinitiv fast GPU, this may be
set to any high (negative) number.
The last parameter is to hack a faster (>1.0) or slower(<1.0) GPU. As we don't
emulate GPU timing very well (eg skip the timings of the pixel stage completely),
an overclock factor of ~0.5 is often much more accurate than 1.0
I tried to change messages that contained instructions for users,
while avoiding messages that are so technical that most users
wouldn't understand them even if they were in the right language.
We are used to use the texture parameter for all util draw calls,
but AMD seems to have a bug where they use the sampler parameter
of stage 0 if no sampler is bound to the used stage.
So as workaround (and a bit as nicer code), we now use sampler
objects everywhere.
- FileSearch is now just one function, and it converts the original glob
into a regex on all platforms rather than relying on native Windows
pattern matching on there and a complete hack elsewhere. It now
supports recursion out of the box rather than manually expanding
into a full list of directories in multiple call sites.
- This adds a GCC >= 4.9 dependency due to older versions having
outright broken <regex>. MSVC is fine with it.
- ScanDirectoryTree returns the parent entry rather than filling parts
of it in via reference. The count is now stored in the entry like it
was for subdirectories.
- .glsl file search is now done with DoFileSearch.
- IOCTLV_READ_DIR now uses ScanDirectoryTree directly and sorts the
results after replacements for better determinism.