Ported the code from RPCS3, with improvements made to the handling of control messages and audio transfers, Co-Authored with @mandar1jn
Missing new line chars
Co-Authored-By: mandar1jn <49076509+mandar1jn@users.noreply.github.com>
We've decided this track will never be used in the future. Releases will
continue using the "beta" branch internally, though we'll have the
user-visible strings use a different name instead.
(Note: Dolphin provided builds have always defaulted to 'beta' as the
auto-update track, so anyone who set 'stable' did so manually.)
This should be a fairly easy merge, assuming I didn’t mess anything up. TL:DR no one uses it and it’s not great.
Boot from DVD Backup is an ancient feature with origins in the Megacommit. Back then, GameCube and Wii games were quite large relative to drives of the time. For example, in 2008, the most common hard drive sizes were 320GB and 512GB. On the 320GB drive I personally had at the time, as little as 42 Wii ISOs could have filled it entirely! And that’s ignoring any other files one might want to put onto a drive. Backup DVDs allowed users to burn relatively cheap DVD media and store their GameCube and Wii dumps in a Dolphin accessible way that didn’t eat into their precious HDD space. It had compromises, even then, but in 2008… I mean honestly users probably wouldn’t even notice those compromises with how Dolphin barely even worked at all back then.
Obviously, today the storage space concerns are not as big of an issue. According to seagate the average hard drive it sells today is 8TB. For typical laptops purchased now, the -minimum- selection for storage is usually 1TB. You can even buy a name brand 4TB external hard drive for $100. GC and Wii ISOs are not as big as they once were, relatively anyway. Plus flash drives and SD cards are super cheap and way faster than disc drives ever were. For anyone that has limited drive space, removable flash media can fulfill this offloading role far better than backup DVD media ever could.
Also no one has DVD drives anymore. That’s kind of an important detail.
But to see if Booting from DVD Backup even still worked, I decided to give it a try. I have an ASUS BW-16D1HT, a badass Bluray XL reading and burning drive, connected to my Windows 11 Threadripper 5975WX machine. A super fast drive on a super fast machine is as good as it possibly can get for this feature. So I bought a spindle of DVD-Rs, burned a couple of discs and gave it a try. Surprisingly, it does still work. However, as expected, it introduces a lot of stuttering. Testing Prime 1 and Prime 3, in both games stuttering was introduced whenever the DVD Drive had to suddenly seek. Spikes of 50ms occurred constantly, but I observed 150ms and even over 1000ms stutters! The worst was a three second stutter, when loading Elysia in Prime 3. I could even hear the stutters - any time the drive suddenly made a harsh seeking noise, the game would have a hard stutter. It worked but, it has some serious compromises.
Boot from DVD Backup isn’t great, using removable flash media or external hard drives is a FAR better option for anyone with limited storage space today, and no one can even use this feature anymore because their computers don’t even have disc drives. It’s time for Boot from DVD Backup to go!
So I did my best on the cleanup but I’m bound to have left some bits. Especially in translation - I didn’t get any warnings or anything there that could help point me to where to clean that up. Please review!
See the comment added by this commit. We were previously guarding against
overshooting in address calculations, but not against undershooting.
Perhaps someone assumed that the displacement of an x86 loadstore was
treated as unsigned?
Note: While the comment says we can undershoot by up to 2 GiB, in
practice Jit64 as it currently behaves won't actually undershoot by more
than 0x8000 if my analysis is correct. But address space is cheap, so
let's guard the full 2 GiB.
This is the behavior in the x64 and ARM64 vertex loaders. I don't know if it makes sense (the whole skipped vertex system seems jank, but several games behave incorrectly without it).
This generated a warning on GCC about the operation being potentially undefined (-Wsequence-point). I'm not sure if that was actually the case, but either way it is a mistake.
Before, it was also compiled on ARM builds, but since it was unused it wasn't linked (and thus its dependency on the nonexistent x64Emitter didn't cause any link issues).
Fixes incorrect logspam when the buffer needed to be reset on flushes (which we already were doing, but 52feed04db moved it to after the check was made). This is https://bugs.dolphin-emu.org/issues/10312.
I also converted it to an assert, as if this does happen, things are going to render incorrectly, so we want to make it obvious.
This was added in #10394 for both the hardware and software backends to work around an issue with Mario Kart Wii, Fortune Street, and Baten Kaitos. However, it seems like the software renderer handles blending well enough that we don't need this (and in any case, it's easy to change blending in the software renderer).
Some experimentation with #11387 (not pushed) showed that the software renderer's logic would also produce correct results on the hardware backends with this hack removed, but would require fbfetch (currently); if a better solution is found the hack can also be removed from the hardware backends.
Otherwise, texelFetch() will use an out-of-bounds layer for game textures (that have 1 layer; EFB copies have 2 layers in stereoscopic 3D mode), which is undefined behavior (often resulting in a black image). The fast texture sampling path uses texture(), which always clamps (see https://www.khronos.org/opengl/wiki/Array_Texture#Access_in_shaders), so it was unaffected by this difference.
The former is deprecated and pretty much all modern drivers
support VK_EXT_debug_utils.
Android drivers dont support it. On those drivers,
we use the implementation provided by the validation layers.
Plus two miscellaneous debugger features that I found along the way when
reading Jit64's code for comparison: bJITNoBlockLinking and tracing.
Fixes https://bugs.dolphin-emu.org/issues/13127.
Small optimization. By not calling WriteExit, the block linking system
never finds out about the exit we're doing, saving us from having to
disable block linking.
We should expose Enable Controller Input and the turbo settings for
GBA just like we do for GameCube controllers and Wii Remotes.
I just forgot about it when implementing the GBA TAS input window.
Previously, if a user on Windows launched Dolphin from the command line
and specified a path to an M3U file and included backslashes in this path,
Dolphin would fail to resolve relative paths in the M3U file.
The calculation of each address in lmw/stmw currently has a dependency
on the calculation of the previous address. By removing this dependency,
the host CPU should be able to pipeline the loads/stores better. The cost
we pay for this is up to one extra register and one extra MOV instruction
per guest instruction, but often nothing.
Making EmitBackpatchRoutine support using any register as the address
register would let us get rid of the MOV, but I consider that to be too
big of a task to do in one go at the same time as this.
Now that we've flipped the C++20 switch, let's start making use of
the nice new <bit> header.
I'm planning on handling this move away from BitUtils.h incrementally
in a series of PRs. There may be a few functions remaining in
BitUtils.h by the end that C++20 doesn't have any equivalents for.
This reverts commit 351d095fff.
In hindsight, my attempted optimization messes with the return
predictor, unlike real tail calls. So I think it does more bad than
good.
The "vector shift by immediate" category encodes the shift amount for
right shifts as `size - amount`, whereas left shifts use `amount`.
We're not actually using SHRN/SHRN2 anywhere, which is why this has gone
undetected.
Use: callstack(0x80000000).
!callstack(value) works as a 'does not contain'.
Add strings to expr.h conditionals.
Use quotations: callstack("anim") to check symbols/name.
For quite some time now, we've had a setting on x86-64 that makes Dolphin
handle NaNs in a more accurate but slower way. There's only one game that
cares about this, Dragon Ball: Revenge of King Piccolo, and what that game
cares about more specifically is that the default NaN (or "generated NaN"
as I believe it's called in PowerPC documentation) is the same as on
PowerPC. On ARM, the default NaN is the same as on PowerPC, so for the
longest time we didn't need to do anything special to get Dragon Ball:
Revenge of King Piccolo working. However, in 93e636a I changed how we
handle FMA instructions in a way that resulted in the sign of NaNs
becoming inverted for nmadd/nmsub instructions, breaking the game.
To fix this, let's implement the AccurateNaNs setting, like on x86-64.
This affected the memory and registers widgets (and possibly others). I'm pretty sure it regressed in 5f629abd8b.
The SetCodeVisible line is a new fix, but the equivalent already existed in the memory widget.
The call to analyzer.Analyze breaks when it attempts to read an instruction, as it eventually tries to read memory when Memory::m_pRAM is nullptr. Trying to read when execution is not paused in general seems like a bad idea (especially as analyzer.Analyze uses PowerPC::TryReadInstruction which can update icache - this is probably still a problem).
Operations that have two operands and can't generate a default NaN,
i.e. addition and subtraction, already have the desired NaN handling
on x86. We just need to make sure to not reverse the operands.
This fixes ps_sum0/ps_sum1 outputting NaNs in cases where they shouldn't.
(HandleNaNs assumes that a NaN in a ps0 input always results in a NaN in
the ps0 output, and correspondingly for ps1.)
1. In some cases, ps_merge01 can be implemented using one instruction.
2. When we need two instructions for ps_merge01, it's best to start with
a MOV to avoid false dependencies on the destination register.
3. ps_merge10 can be implemented using a single EXT instruction.
This regressed in 0a906f553f, I think (though I haven't confirmed it). Mario Tennis and Luigi's Mansion both use these for some reason (as far as I can tell, the data isn't actually used; it's just extra data included for no reason)
DataReader is generally jank - it has a start and end pointer, but the end pointer is generally not used, and all of the vertex loaders mostly bypassed it anyways.
Wrapper code (the vertex loaer test, as well as Fifo.cpp and OpcodeDecoding.cpp) still uses it, as does the software vertex loader (which is not a subclass of VertexLoader). These can probably be eliminated later.
This new function is like MOVP2R, except it masks out the lower 12 bits,
returning them instead of writing them to the register. These lower
12 bits can then be used as an offset for LDR/STR. This lets us turn
ADRP+ADD+LDR sequences with a zero offset into ADRP+LDR sequences with
a non-zero offset, saving one instruction.
When emulated GBAs were added to Dolphin, it was possible to control them
using the GC TAS input window. (Z was mapped to Select.) Unaware of this,
I broke the functionality in b296248.
To make it possible to control emulated GBAs using TAS input again,
I'm adding a proper TAS input window for GBAs, with a real Select button
and no analog controls.
0e02ddcf52 removed separate logic for tiled versus non-tiled EFB peek caches, and as part of that made it so that color peeks updated the frame access mask even when a non-tiled cache is in use. However, the same change was not made for depth peeks. I'm not sure if this affected anything in practice.
`ImGui::GetIO` performs an assertion that a context exists, and if one doesn't then things will likely crash. Unfortunately this crash is hard to consistently reproduce.
I recently talked to a homebrew developer who was trying to add exception
handlers at link time but found out that Dolphin was overwriting their
exception handlers. I figure that's not the usual way to do exception
handlers, but... making us load the executable after setting up memory
rather than before is easy, and matches what we do when booting discs,
so I suppose there's no reason not to do it. It also matches the intent
of why Dolphin is writing default exception handlers – we're writing
them because some homebrew relies on exception handlers being left
around from whatever program was running before it (see 3dd777be70).
Let's take advantage of ARM64's input register shifting one last time,
shall we?
Before:
0x1280005b mov w27, #-0x3
0x1b1b7f18 mul w24, w24, w27
After:
0x4b180b18 sub w24, w24, w24, lsl #2
ARM64's flexible shifting of input registers also allows us to calculate
a negative power of two in one instruction; shift the input of a NEG
instruction.
Before:
0x128001f7 mov w23, #-0x10
0x1b1a7efa mul w26, w23, w26
0x93407f58 sxtw x24, w26
After:
0x4b1a13fa neg w26, w26, lsl #4
0x93407f58 sxtw x24, w26
If the destination register doesn't equal the input register, using it
to temporarily hold the immediate value is fair game as it'll be
overwritten with the result of the multiplication anyway. This can
slightly reduce register pressure.
Before:
0x52800659 mov w25, #0x32
0x1b197f5b mul w27, w26, w25
After:
0x5280065b mov w27, #0x32
0x1b1b7f5b mul w27, w26, w27
By taking advantage of ARM64's ability to shift an input register by any
amount, we can calculate multiplication by a number that is one more
than a power of two with a single instruction.
Before:
0x52800838 mov w24, #0x41
0x1b187f7b mul w27, w27, w24
After:
0x0b1b1b7b add w27, w27, w27, lsl #6
Turn multiplications by a power of two into bitshifts.
Before:
0x52800817 mov w23, #0x40
0x1b167ef6 mul w22, w23, w22
After:
0x531a66d6 lsl w22, w22, #6
Multiplication by one is also trivial. Depending on the registers
involved, either a single MOV or no instructions will be generated.
Before:
0x52800038 mov w24, #0x1
0x1b1a7f1b mul w27, w24, w26
After:
0x2a1a03fb mov w27, w26
Before:
0x52800039 mov w25, #0x1
0x1b1a7f3a mul w26, w25, w26
After:
Nothing!
Add a new function that will handle all the special cases regarding
multiplication. It does nothing for now, but will be expanded in
follow-up commits.
We can merge an SXTW with the SUB, eliminating one instruction. In
addition, it is no longer necessary to allocate a temporary register,
reducing register pressure.
Before:
0x93407f59 sxtw x25, w26
0x93407ebb sxtw x27, w21
0xcb1b033b sub x27, x25, x27
After:
0x93407f5b sxtw x27, w26
0xcb35c37b sub x27, x27, w21, sxtw
ARM64 can do perform various types of sign and zero extension on a
register value before using it. The Arm64Emitter already had support for
this, but it was kinda hidden away.
This commit exposes the functionality by making the ExtendSpecifier enum
available everywhere and adding a new ArithOption constructor.
[ VUID-VkDescriptorPoolCreateInfo-maxSets-00301 ] Object 0:
handle = 0x7f1,b8d,3cd,e70, type = VK_OBJECT_TYPE_DEVICE; |
MessageID = 0xa1,70e,236 | vkCreateDescriptorPool():
pCreateInfo->maxSets is not greater than 0.
The Vulkan spec states: maxSets must be greater than 0
BindFramebuffer depends on the pipeline which might not be set yet.
That's why the framebuffer dirty flag exists in the first place.
I assume BindFramebuffer was called directly here, in order to handle
the texture state transitions necessary for DiscardResource.
The state is tracked anyway, so we can just issue those transitions there
too and defer binding the actual framebuffer.
Fixes an issue in Zelda Twilight Princess with EFB depth peeks.
Dolphin would bind a frame buffer which doesn't have an integer format
descriptor for the color target before binding the new pipeline.
So it would accidentally use the 0 descriptor.
Debug layer error:
D3D12 ERROR: ID3D12CommandList::OMSetRenderTargets:
Specified CPU descriptor handle ptr=0x0000000000000000 does not refer to
a location in a descriptor heap. pRenderTargetDescriptors[0] is the issue.
[ EXECUTION ERROR #646: INVALID_DESCRIPTOR_HANDLE]
Fixes the following error in the D3D12 debug layer:
D3D12 WARNING: ID3D12Device::CreateCommittedResource:
Ignoring InitialState D3D12_RESOURCE_STATE_UNORDERED_ACCESS.
Buffers are effectively created in state D3D12_RESOURCE_STATE_COMMON.
[ STATE_CREATION WARNING #1328: CREATERESOURCE_STATE_IGNORED]
Fixes the following error in the D3D12 debug layer:
D3D12 ERROR: ID3D12DescriptorHeap::GetGPUDescriptorHandleForHeapStart:
GetGPUDescriptorHandleForHeapStart is invalid to call on a descriptor
heap that does not have DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE set.
If the heap is not supposed to be shader visible, then
GetCPUDescriptorHandleForHeapStart would be the appropriate method
to call. That call is valid both for shader visible and non shader
visible descriptor heaps.
[ STATE_GETTING ERROR #1315: DESCRIPTOR_HEAP_NOT_SHADER_VISIBLE]
When searching for a disc where the revision doesn't match any disc in
the datfile, the loop would never get to the part where serials_exist is
set to true, leading to a bogus error message.
Because of the previous commit, `regs_in_use` must not include `dest_reg`
when calling MMIOLoadToReg. There are also some other registers we can
skip including in regs_in_use just for efficiency's sake.
The `addr_reg_set = false` statements that I've added in this commit are
technically redundant – if `mmio_address` is non-zero then `addr_reg_set`
is already false – but it's just a coincidence that that's the case.
I originally added these in 2b1d1038a6, for both the TPipelineFunction and the size. The size was moved into the header in fdcd2b7d00 (making the size functions obsolete), but it seems that the functions themselves are no longer needed now.
I think I didn't use this approach before because it would have required ComponentFormatTable and ComponentCountRow to be templated, which would end up resulting in lines that were too long and thus wrapped in awkward places. (I *think* they didn't get inferred properly.) Now that we only need TPipelineFunction, the templating is not needed, and this ends up being a more readable version of the version with the wrapper functions.
The old calculation was stride * (max_index + 1), which fails if stride is less than the size of a component (for instance, if float XYZ positions are used, and the stride was set to 4 (i.e. sizeof(float)) instead of 12 (i.e. 3 * sizeof(float)), it would be missing the last 8 bytes of the final element in the array. Or, if stride was set to 0, then no bytes would be recorded at all (though that's not a useful configuration so it's unlikely to actually exist).
I'm not aware of any games affected by this issue.
This should fix recording the wall in the staircase leading to the basement in Luigi's Mansion (though I haven't tested it, as I don't own a copy of Luigi's Mansion). This uses NormalIndex3, and the index for the normal vector (generally 0x02XX or 0x01XX) there is always lower than the tangent or binormal (generally 0x07XX). Other games seem to usually have a similar range of indices for the normal, tangent, and binormal, so this issue wouldn't affect them.
In most cases, games will use the same type for all vertex components (either Index8 or Index16 or Direct). However, RS2's deflection towers use Index16 for the texture coordinate and Index8 for everything else, meaning the texture coordinates were recorded incorrectly (the first byte was used, so only indices 0 and 1 were recorded instead of 0 through 0x0192). Worse still, some background elements in RS2 use direct positions but indexed normals or texture coordinates, and those would not be recorded at all.
This is a regression from b5fd35f951.
`count` is the number of stereo samples to write (where each stereo sample is two shorts), while `BUFFER_SIZE` is the size of the buffer in shorts. So `count` needs to be multiplied by `2`, not `BUFFER_SIZE`. Also, when this check was failed, the previous code just clobbered whatever was past the end of the buffer after logging the warning, which corrupted `basename`, eventually resulting in Dolphin crashing.
This affected Datel's Wii-compatible Action Replay, which uses a block size of 2298, or 18384 stereo samples, which is 36768 shorts, which is bigger than the buffer size of 32768. (However, the previous commit means that only one block is transfered at a time, eliminating this issue; fixing the bounds check is just a general safety thing instead of an actual bugfix now.)
The previous implementation of Force25BitPrecision was essentially a
translation of the x86-64 implementation. It worked, but we can make a
more efficient implementation by using an AArch64 instruction I don't
believe x86-64 has an equivalent of: URSHR. The latency is the same as
before, but the instruction count and register count are both reduced.
The new `dispatcher_no_timing_check` is the same as `dispatcher_no_check`
except it includes the "stepping check" in debug mode. This lets us avoid
the `m_enable_debugging ? dispatcher : dispatcher_no_check` dance.
Maybe "tail call" isn't quite the right term for what this code
is doing, since it's jumping to the dispatcher rather than
returning, but it's the same optimization as for a tail call.
fregsIn will include FD for double-precision instructions, since for
dependency tracking purposes the instruction does read the upper
half of FD. This is not what we want in HandleNaNs.
The consequence of this bug is that if an instruction was supposed to
output a NaN and FD happens to contain a NaN and FD happens to be the
same register as an unused register in the instruction encoding, the
NaN in FD could get used as the output instead of the correct NaN.
This isn't known to affect any games, which isn't especially surprising
considering that there's only one game that needs AccurateNaNs anyway.
Jumping to `dispatcher` requires first subtracting the downcount,
otherwise `dispatcher` may unpredictably jump to CoreTiming::Advance,
which could break determinism compatibility with JitArm64. We should
jump to `dispatcher_no_check` instead.
The breakpoint check in Jit.cpp makes it redundant.
Normally this redundant check doesn't cause any issues, but if you
create a breakpoint and enable logging without breaking, you get two
log messages if the breakpoint is at the beginning of a block. See
https://bugs.dolphin-emu.org/issues/13044.
This is also a tiny performance improvement for when debugging is
active, since we no longer check for breakpoints for blocks that never
had any breakpoints to begin with.
Nothing currently uses it. It could theoretically be replaced with fmt support, but I don't think the LOG_VULKAN_ERROR macro is that useful and it'd be better to replace it with regular logging instead.
base is an unsigned variable, so we can make things little more
consistent by making the loop index unsigned so we aren't doing bit
arithmetic with signed types.
MemoryInterface already does this, so we can leave it alone.
No behavioral changes, just a consistency thing.
Rather than makring some parts of VertexLoaderManager dirty in some places and some in others, do it all in VideoState. Also, since CPState no longer contains pointers/non-CP data after d039b1bc0d, we can just use p.Do on it instead of manually saving each field.
Micro-optimization. Some CPUs can fuse CMP+B, TST+B, arith+CBZ, etc.
I also moved things around for CMP+CSET and TST+CSET - which I'm not sure
if any CPUs support - but it doesn't hurt anything, so I might as well.
Improves accuracy but isn't known to affect any games.
This turned out to be fairly convenient to implement; ORing with the
PPC default NaN will quieten SNaNs and do nothing to QNaNs.
This existed in the initial megacommit (though I don't know why) as IO_SIZE. It was used in Memmap's Init() to compute totalMemSize, but I don't know if it actually did anything then. That use was removed in 2d0f714546, but the constant persisted until cc858c63b8, when it became a static variable.
This was added in 385d8e2b15, but became somewhat redundant with Do in 4c7bbd96e4, and completely redundant now that std::is_trivially_copyable_v is well-supported.
This is the first step of getting rid of the controller indirection
on Android. (Needing a way for touch controls to provide input
to the emulator core is the reason why the controller indirection
exists to begin with as far as I understand it.)
This lets the TAS input code use a higher-level interface for
overriding inputs instead of having to fiddle with raw bits.
WiiTASInputWindow in particular was messy with how much
controller code it had to re-implement.
Fixes a Rogue Squadron II regression from 9d73583.
This set_dirty stuff is pretty tricky to reason about. I thought I
was clever when coming up with set_dirty, but maybe I was too clever
for my own good...
In case the register we're binding is the same as the immediate register,
we should fetch the immediate before calling BindToRegister. The way
the register cache currently works, calling GetImm after BindToRegister
actually does work, but it's better to not rely on it.
Avoid waiting for earlier submissions when we flush more often.
The vertex manager will flush more often if the game accesses the EFB
on the CPU, to give the GPU a head start.
Before, only the symbols box would update. However, if you edit the symbol of a function in the call stack (which seems like something that would happen reasonably often while debugging), the call stack would be out of date until it was updated by clicking on it. Callers and calls were more of an edge case; for them to be out of date, you would need to right-click on an instruction in a function other than the one containing the currently-selected instruction (though it would also affect recursive functions).
Tested on an official DOL-014 (251 blocks) memory card by executing the
0xf4 command on a card with content along its entire length and then
dumping the whole card: it reads as 0xff all the way through.
Therefor, the current implementation is already consistent with hardware.
This reverts commit fb265b610d.
The optimization in that commit is safe when the executor thread is
writing and the GUI thread is reading, but I had failed to take into
account that it's unsafe when the GUI thread is writing and the executor
thread is reading. (The native UpdateAdditionalMetadata function loops
through m_cached_files, which is unsafe if another thread is adding
elements to m_cached_files simultaneously.)
Losing out on this optimization isn't too bad, because
719930bb39 makes it very unlikely that
both threads will want the lock at the same time.
Texture dumping can already be done using VideoCommon's system (and in fact the same setting already enabled *both* of these). Dumping objects/TEV stages/texture fetches doesn't currently have an equivalent, but could be added to the FIFO player instead.
A (partial) port of #9481 to ARM64. This commit adds special cases for
immediate values equal to 0 or 0xFFFFFFFF, allowing for more efficient
or no code to be generated.
When a guest register is an immediate, it may be necessary to move this
value into a register. This is handled by gpr.R(), which lacks context
on how the register will be used. This leads to cases where the
immediate is written to a register, only for it to be overwritten. Take
for example this code generated by srwx:
0x5280031b mov w27, #0x18
0x53187edb lsr w27, w22, #24
gpr.BindToRegister() does have this context through the do_load
parameter, but didn't handle immediates. By adding this logic, we can
intelligently skip the write when do_load is false.
Because of the previous commit, this is needed to stop DolphinQt from
forgetting that the user pressed ignore whenever any part of the config
is changed.
This commit also changes the behavior a bit on DolphinQt: "Ignore for
this session" now applies to the current emulation session instead of
the current Dolphin launch. This matches how it already worked on
Android, and is in my opinion better because it means the user won't
lose out on important panic alerts in a game becase they played another
game first that had repeated panic alerts that they wanted to ignore.
For Android, this commit isn't necessary, but it makes the code cleaner.
Not doing so produces a warning in clang:
ISO C++20 considers use of overloaded operator '!=' (with operand types
'Metal::DepthStencilSelector' and 'Metal::DepthStencilSelector') to be
ambiguous despite there being a unique best viable function with
non-reversed arguments
The underlying reason for this warning is an incorrect method signature.
* 'hangle' was a typo
* Light colors include an alpha value, so they should be 8 characters, not 6
* The XF command format adds 1 to the count internally (so 0 is one word), but we need to subtract that back to produce a valid command
* XFMEM_POSTMATRICES was calculating the row by subtracting XFMEM_POSMATRICES (POS vs POST), resulting in incorrect row numbering
It stores both the konst selection value for alpha and color channels (for two tev stages per ksel), and half of a swap table row (there are 4 total swap tables, which can be used for swizzling the rasterized color and the texture color, and indices selecting which tables to use are stored per tev stage in the alpha combiner). Since these are indexed very differently, the old code was hard to follow.
The masking was incorrect. This affects the main menu of The Last Avatar, though that menu also relies on copy filter functionality that is not correctly handled in the software renderer so the difference is not obvious; that game shuffles textures across all indices for some reason, so this issue would presumably result in subtle flickering.
Per https://en.cppreference.com/w/cpp/preprocessor/replace#.23_and_.23.23_operators the `##` behavior is a nonstandard extension; this extension seems to be supported by all compilers we care about, but IntelliSense in visual studio doesn't correctly handle it, resulting in false errors in the IDE (but not when compiling).
Per https://en.cppreference.com/w/cpp/preprocessor/replace#Function-like_macros C++20 introduced a workaround, where `__VA_OPT__(, )` generates a comma if and only if `__VA_ARGS__` is non-empty.
This PR replaces all occurrences, with the exception of Externals, DSPSpy (which is not likely to be edited in MSVC and does not target C++20 currently), and JitArm64_Integer.cpp (which uses `Function(__VA_ARGS__)`, and thus does not ever need a comma).
Fixes https://bugs.dolphin-emu.org/issues/13017. With uCode switching, the existing instance of AXUCode is re-activated when GBAUCode is done, but if the state remains as WaitingForNextTask, it won't be able to do anything. Instead, it needs to be in WaitingForCmdListSize.
(When the AX uCode is resumed, startpc is set to 0x0030, at least for 0x07f88145; this is the same location as MAIL_RESUME jumps to, so DSP_RESUME should be sent when the resuming happens; that's already handled by AXUCode::Update.)
walking the zip prevents minizip from re-reading the same
data repeatedly from the actual backing filesystem.
also improves most usages of minizip to allow for >4GB,
files altho we probably don't need it
dir_path is used by PanicAlertFormatT, which prior to PR 10209 used a
lambda. Before c++20, referring to structured bindings in lambda captures
was forbidden. The problem is now doubly fixed, so put the structured
binding back in.
Fixes the Dolphin bug mentioned in
https://github.com/dolphin-emu/hwtests/issues/45.
Because this doesn't fix any observed behavior in games (no, 1080°
Avalanche isn't affected), I haven't implemented this in the JITs,
so as to not cause unnecessary performance degradations.
This was causing a bug in the rounding of paired single multiplication
operands. If Force25BitPrecision was called for quad registers, the
element size of its ADD instruction would get treated as if it was 16
instead of the intended 64, which would cause the result of the
calculation to be incorrect if the carry had to pass a 16-bit boundary.
Fixes one of the two bugs reported in
https://bugs.dolphin-emu.org/issues/12998.
This command does not upload the MAIN buffers to CPU memory. This was
functionally fixed in f11a40f858 without
updating the comments and variable names.
Prior to 7854bd7109, this was used by the debugger for the OpenGL and D3D9 plugins to control logging (via PRIM_LOG and INFO_LOG/DEBUG_LOG in VideoCommon code; PRIM_LOG was changed in 77215fd27c), and also framedumping (removed in 64927a2f81 and 2d8515c0cf), shader dumping (removed in 2d8515c0cf and this commit), and texture dumping (removed in 54aeec7a8f). Apart from shader dumping, all of these features have modern alternatives, and shader source code can be seen in RenderDoc if "Enable API Validation Layers" is checked (which also enables source attachment), so there's no point in keeping this around.
Previously, we had WBFS and CISO which both returned an upper bound
of the size, and other formats which returned an accurate size. But
now we also have NFS, which returns a lower bound of the size. To
allow VolumeVerifier to make better informed decisions for NFS, let's
use an enum instead of a bool for the type of data size a blob has.
For a few years now, I've been thinking it would be nice to make Dolphin
support reading Wii games in the format they come in when you download
them from the Wii U eShop. The Wii U eShop has some good deals on Wii
games (Metroid Prime Trilogy especially is rather expensive if you try
to buy it physically!), and it's the only place right now where you can
buy Wii games digitally.
Of course, Nintendo being Nintendo, next year they're going to shut down
this only place where you can buy Wii games digitally. I kind of wish I
had implemented this feature earlier so that people would've had ample
time to buy the games they want, but... better late than never, right?
I used MIT-licensed code from the NOD library as a reference when
implementing this. None of the code has been directly copied, but
you may notice that the names of the struct members are very similar.
c1635245b8/lib/DiscIONFS.cpp
Needed for the next commit. NFS disc images are hashed but not encrypted.
While we're at it, also get rid of SupportsIntegrityCheck.
It does the same thing as old IsEncryptedAndHashed and new HasWiiHashes.
This normalization was added in 02ac5e95c8, and changed to use floats in 4bf031c064. The conversion to floats means that sometimes there is insufficient precision for the normalization process, which results in values of NaN or infinity. Performing the whole process with doubles prevents that, but games also sometimes set the values to NaN or infinity directly (possibly accidentally due to the values not being initialized due to them not being used in the current configuration?).
The version of Mesa currently in use on FifoCI (20.3.5) has issues with NaN. Although this bug has been fixed (b3f3287eac in 21.2.0), FifoCI is stuck with the older version.
This change may or may not be incorrect, but it should result in the same behavior as already present in Dolphin, while working around the Mesa bug.
CARDUCode, GBAUCode, and INITUCode previously didn't have an implementation of it. In practice it's unlikely that this caused an issue, since these uCodes are only active for a few frames at most, but now that GBAUCode doesn't have global state, we can implement it there. I also implemented it for CARDUCode, although our CARDUCode implementation does not have all states handled yet - this is simply future-proofing so that when the card uCode is properly implemented, the save state version does not need to be bumped. INITUCode does not have any state to save, though.
The accuracy improvements are:
* The request mail must be 0xabba0000 exactly; both the low and high parts are checked
* The address is masked with 0x0fffffff
* Before, the global state meant that after the GBA uCode had been used once, it would accept 0xcdd1 commands immediately. Now, it only accepts them after execution has finished.
* moves dolphin-specific settings out of Base.props
* creates exports.props for externals, allowing to easily import
individual Externals
* corrects some cruft that accumulated and probably contributed
to msbuild overbuilding
These lookup tables total 4 megabytes, and contain data that's entirely redundant to the actual cache state (as part of an optimization, though I'm not sure whether the optimization actually is useful). This change instead recomputes these lookup tables when loading the state (which involves filling the lookup table with a marker (0xff), and then setting the 128 * 8 valid entries (1 kilobyte)).