This commit should have zero performance effect if SSBOs are supported.
If they aren't (e.g. on all Macs), this commit alters FramebufferManager
to attach a new stencil buffer and VertexManager to draw to it when
bounding box is active. `BBoxRead` gets the pixel data from the buffer
and dumbly loops through it to find the bounding box.
This patch can run Paper Mario: The Thousand-Year Door at almost full
speed (50–60 FPS) without Dual-Core enabled for all common bounding
box-using actions I tested (going through pipes, Plane Mode, Paper
Mode, Prof. Frankly's gate, combat, walking around the overworld, etc.)
on my computer (macOS 10.12.3, 2.8 GHz Intel Core i7, 16 GB 1600 MHz
DDR3, and Intel Iris 1536 MB).
A few more demanding scenes (e.g. the self-building bridge on the way
to Petalburg) slow to ~15% of their speed without this patch (though
they don't run quite at full speed even on master). The slowdown is
caused almost solely by `glReadPixels` in `OGL::BoundingBox::Get`.
Other implementation ideas:
- Use a stencil buffer that's separate from the depth buffer. This would
require ARB_texture_stencil8 / OpenGL 4.4, which isn't available on
macOS.
- Use `glGetTexImage` instead of `glReadPixels`. This is ~5 FPS slower
on my computer, presumably because it has to transfer the entire
combined depth-stencil buffer instead of only the stencil data.
Getting only stencil data from `glGetTexImage` requires
ARB_texture_stencil8 / OpenGL 4.4, which (again) is not available on
macOS.
- Don't use a PBO, and use `glReadPixels` synchronously. This has no
visible performance effect on my computer, and is theoretically
slower.
This stops the virtual method call from within the Renderer constructor.
The initialization here for GL had to be moved to VideoBackend, as the
Renderer constructor will not have been executed before the value is
required.
This moves all the byte swapping utilities into a header named Swap.h.
A dedicated header is much more preferable here due to the size of the
code itself. In general usage throughout the codebase, CommonFuncs.h was
generally only included for these functions anyway. These being in their
own header avoids dumping the lesser used utilities into scope. As well
as providing a localized area for more utilities related to byte
swapping in the future (should they be needed). This also makes it nicer
to identify which files depend on the byte swapping utilities in
particular.
Since this is a completely new header, moving the code uncovered a few
indirect includes, as well as making some other inclusions unnecessary.
Before #4581, an invocation of `SetBlendMode` could invoke
`glBlendEquationSeparate` and `glBlendFuncSeparate` even when it was
setting `glDisable(GL_BLEND)`. I couldn't figure out how to map the old
behavior over to the new BlendingState code, so I changed it to always
call the two blend functions.
Fixes https://bugs.dolphin-emu.org/issues/10120 : "Sonic Adventure 2
Battle: graphics crash when loading first Dark level".
We (the Microsoft C++ team) use the dolphin project as part of our "Real world code" tests.
I noticed a few issues in windows specific code when building dolphin with the MSVC compiler
in its conformance mode (/permissive-). For more information on /permissive- see our blog
https://blogs.msdn.microsoft.com/vcblog/2016/11/16/permissive-switch/.
These changes are to address 3 different types of issues:
1) Use of qualified names in member declarations
struct A {
void A::f() { } // error C4596: illegal qualified name in member declaration
// remove redundant 'A::' to fix
};
2) Binding a non-const reference to a temporary
struct S{};
// If arg is in 'in' parameter, then it should be made const.
void func(S& arg){}
int main() {
//error C2664: 'void func(S &)': cannot convert argument 1 from 'S' to 'S &'
//note: A non-const reference may only be bound to an lvalue
func( S() );
//Work around this by creating a local, and using it to call the function
S s;
func( s );
}
3) Add missing #include <intrin.h>
Because of the workaround you are using in the code you will need to include
this. This is because of changes in the libraries and not /permissive-
Keeps associated data together. It also eliminates the possibility of out
parameters not being initialized properly. For example, consider the
following example:
-- some FramebufferManager implementation --
void FBMgrImpl::GetTargetSize(u32* width, u32* height) override
{
// Do nothing
}
-- somewhere else where the function is used --
u32 width, height;
framebuffer_manager_instance->GetTargetSize(&width, &height);
if (texture_width != width) <-- Uninitialized variable usage
{
...
}
It makes it much more obvious to spot any initialization issues, because
it requires something to be returned, as opposed to allowing an
implementation to just not do anything.
Hopefully will fix the crash in vkDestroyInstance on the NV Shield TV,
and likely reduce boot times slightly for drivers that take a while
to create instances.
This happened when the geometry shader was disabled, and the uniform
buffer was grown to a larger size. The update would be skipped, leaving
the old buffer to be included in the descriptor set.
This also moves the pipeline and descriptor set layouts used for texture
conversion (texel buffers) to ObjectCache, and shares a binding location
with the SSBO set.
Should we ever introduce anything else that has to be done when a command
buffer is executed (e.g. invalidating constants from previous commit), we
don't have to update all the callers.
This could happen because the vertex memory was already committed, if a
uniform buffer allocation failed and caused a command buffer to be
executed, it would be associated with the previous command buffer rather
than the buffer containing the draw that consumed these vertices.
This was happening when a fence wait happened mid-frame. The data written
between the fence being queued and the allocation occuring was incorrectly
assumed to be consumed by the GPU.
Making changes to ConfigManager.h has always been a pain, because
it means rebuilding half of Dolphin, since a lot of files depend on
and include this header.
However, it turns out some includes are unnecessary. This commit
removes ConfigManager includes from files which don't contain
SConfig or GPUDeterminismMode or GPU_DETERMINISM (which means the
ConfigManager include is not used).
(I've also had to get rid of some indirect includes.)
Instead of resetting two command buffers, now we only have to call
vkResetCommandPool once at the start of a frame.
NV's recommends using one pool per frame/thread. May offer a very small
boost in performance on some systems.
anv seems to set this to zero, which is fine according to the spec, but
we were using it as a maximum, which was resulting in a swap chain
without any buffers being created.
This fixes the screenshot stutter, as this needs more than a frame.
So we won't stall on the png writing at all until emulation stops or
a new screenshot is requested.
There's not a lot of point in passing these around or storing them
(texture cache/state tracker mainly) as there will only ever be a single
instance of the class.
Also adds downcast helpers such as Vulkan::Renderer::GetInstance().