Avoid waiting for earlier submissions when we flush more often.
The vertex manager will flush more often if the game accesses the EFB
on the CPU, to give the GPU a head start.
Texture dumping can already be done using VideoCommon's system (and in fact the same setting already enabled *both* of these). Dumping objects/TEV stages/texture fetches doesn't currently have an equivalent, but could be added to the FIFO player instead.
Not doing so produces a warning in clang:
ISO C++20 considers use of overloaded operator '!=' (with operand types
'Metal::DepthStencilSelector' and 'Metal::DepthStencilSelector') to be
ambiguous despite there being a unique best viable function with
non-reversed arguments
The underlying reason for this warning is an incorrect method signature.
It stores both the konst selection value for alpha and color channels (for two tev stages per ksel), and half of a swap table row (there are 4 total swap tables, which can be used for swizzling the rasterized color and the texture color, and indices selecting which tables to use are stored per tev stage in the alpha combiner). Since these are indexed very differently, the old code was hard to follow.
The masking was incorrect. This affects the main menu of The Last Avatar, though that menu also relies on copy filter functionality that is not correctly handled in the software renderer so the difference is not obvious; that game shuffles textures across all indices for some reason, so this issue would presumably result in subtle flickering.
Per https://en.cppreference.com/w/cpp/preprocessor/replace#.23_and_.23.23_operators the `##` behavior is a nonstandard extension; this extension seems to be supported by all compilers we care about, but IntelliSense in visual studio doesn't correctly handle it, resulting in false errors in the IDE (but not when compiling).
Per https://en.cppreference.com/w/cpp/preprocessor/replace#Function-like_macros C++20 introduced a workaround, where `__VA_OPT__(, )` generates a comma if and only if `__VA_ARGS__` is non-empty.
This PR replaces all occurrences, with the exception of Externals, DSPSpy (which is not likely to be edited in MSVC and does not target C++20 currently), and JitArm64_Integer.cpp (which uses `Function(__VA_ARGS__)`, and thus does not ever need a comma).
Adds a pass to process driver deficiencies between UID caching and use, allowing a full view of the whole pipeline, since some bugs/workarounds involve interactions between blend modes and the pixel shader
This increases accuracy, fixing the white rendering in Major Minor's Majestic March. However, the hardware backends can only have one viewport and scissor rectangle at a time, while sometimes multiple are needed to accurately emulate what is happening. If possible, this will need to be fixed later.
I think this is a relic of D3D9. D3D11 and D3D12 seem to work fine without it. Plus, ViewportCorrectionMatrix just didn't work correctly (at least with the viewports being generated by the new scissor code).
These aren't particularly useful, and make the code a bit more confusing. If for some reason someone wants to test what happens when these functions are disabled, it's easier to just edit the code that implements them. They aren't exposed in the UI, so one would need to restart Dolphin to do it anyways.
This message would be logged, usually multiple times, for EVERY. SINGLE. PIXEL. That's pretty much useless and just makes the log unreadable. Plus, the current support (which acts as RGB8) is close enough that for end-user purposes, it's fine. I don't think the hardware backends support RGB565_Z16 and its antialiasing functionality correctly either, but they don't have similar logspam.
It doesn't make sense for alpha to add the bias ONLY when dividing by 2, while color doesn't apply the bias for divide by 2 only; hardware testing indicates that alpha should have the bias.
This fixes the menus in Mario Kart Wii (https://bugs.dolphin-emu.org/issues/11909) but reintroduces the white rectangle in Fortune Street.
This reverts commit 5aaa5141ed (and several other matching changes elsewhere).
Fixes a crash that could occur if the static constructor function for
the MainSettings.cpp TU happened to run before the variables in
Common/Version.cpp are initialised. (This is known as the static
initialisation order fiasco.)
By using wrapper functions, those variables are now guaranteed to be
constructed on first use.
We don't use sampler2DMS, but we do use sampler2DMSArray.
I can't reproduce it on my phone, but a user who was running GLES
on a Tegra X1 reported a shader compilation error related to this.
The benefit to exposing this over the raw BP state is that adjustments Dolphin makes, such as LOD biases from arbitrary mipmap detection, will work properly.
The compiler was loudly announcing each and every branch Tev was not checking in
a switch statement, but Tev has learned it's lesson and will produce that
warning no more.
Currently the logic for addressing the individual TexUnits is splattered all
across dolphin's codebase, this commit attempts to consolidate it all into a
single place and formalise it using our new TexUnitAddress struct.
This fixes various texture offsetting issues with negative texture coordinates (bringing the software renderer in line with the hardware renderers). It also handles the invalid wrap mode accurately (as was done for the hardware renderers in the previous commit). Lastly, it handles wrapping with non-power-of-2 texture sizes in a hardware-accurate way (which is somewhat broken looking, as games aren't supposed to use wrapping with non-power-of-2 sizes); this has not been done for the hardware renderers.
SPDX standardizes how source code conveys its copyright and licensing
information. See https://spdx.github.io/spdx-spec/1-rationale/ . SPDX
tags are adopted in many large projects, including things like the Linux
kernel.
Running the min/max operation on the upside down, quad-rounded pixel
coordinates before inverting them to the standard upper-left origin
produces wrong results. Therefore, we need to do the inversion before
rounding to pixel quads.
We also need to ensure the the CPU does not receive stale values
which have been updated by the GPU. Apparently the buffer here
is not coherent on NVIDIA drivers. Not sure if this is a driver
bug/spec violation or not, one would think that
glGetBufferSubData() would invalidate any caches as needed, but
this path is only used on NVIDIA anyway, so it's fine. A point
to note is that according to ARB_debug_report, it's moved from
video to host memory, which would explain why it needs the
cache invalidate.
MAX_XFB_WIDTH/HEIGHT are the largest XFB sizes seen in practice, but do not make sense to use for the backbuffer size, which should be the size of the window. The old code created screenshots with a size of 720x540 on NTSC games when "Dump Frames at Internal Resolution" is unchecked; now, the window size is used.
When trying to do a small optimization in 8a0f5ea, I failed to
take into account that WeakFlush and FlushOne update m_query_count.
Only D3D11 and OGL had this problem, not D3D12 and Vulkan.
The STL has everything we need nowadays.
I have tried to not alter any behavior or semantics with this
change wherever possible. In particular, WriteLow and WriteHigh
in CommandProcessor retain the ability to accidentally undo
another thread's write to the upper half or lower half
respectively. If that should be fixed, it should be done in a
separate commit for clarity. One thing did change: The places
where we were using += on a volatile variable (not an atomic
operation) are now using fetch_add (actually an atomic operation).
Tested with single core and dual core on x86-64 and AArch64.
Previously we set the texture coordinate to zero, now we set
the texture coordinate *index* to zero. This fixes the ripple
effect of the Mario painting in Luigi's Mansion.
VideoCommon: Change the type of BPMemory.scissorOffset to 10bit signed: S32X10Y10
VideoBackends: Fix Software Clipper.PerspectiveDivide function, use BPMemory.scissorOffset instead of hard code 342
Fixes issue 11393.
The problem is that left and top make no sense for a width by height array; they only make sense in a larger array where from which a smaller part is extracted. Thus, the overall size of the array is provided to CopyRegion in addition to the sub-region. EncodeXFB already handles the extraction, so CopyRegion's only use there is to resize the image (and thus no sub-region is provided).
Additional changes:
- For TevStageCombiner's ColorCombiner and AlphaCombiner, op/comparison and scale/compare_mode have been split as there are different meanings and enums if bias is set to compare. (Shift has also been renamed to scale)
- In TexMode0, min_filter has been split into min_mip and min_filter.
- In TexImage1, image_type is now cache_manually_managed.
- The unused bit in GenMode is now exposed.
- LPSize's lineaspect is now named adjust_for_aspect_ratio.
Additionally, VCacheEnhance has been added to UVAT_group1. According to YAGCD, this field is always 1.
TVtxDesc also now has separate low and high fields whose hex values correspond with the proper registers, instead of having one 33-bit value. This change was made in a way that should be backwards-compatible.