When trying to do a small optimization in 8a0f5ea, I failed to
take into account that WeakFlush and FlushOne update m_query_count.
Only D3D11 and OGL had this problem, not D3D12 and Vulkan.
Sorry, the fix I made to the empty string in a29660a was not
actually sufficient, as DolphinQt will call tr on the string
regardless of whether it's marked with _trans. The proper fix
is to use nullptr, which DolphinQt has a special check for.
Sending an empty string to the translation system will not
result in getting an empty string back, but rather a description
of the currently loaded translations file. So empty strings
should not be marked as translatable.
Also adding some i18n comments and rewording a string I thought
was hard to understand.
Casting a value to u32 when it's originally an int (and is exposed as an int to users)
could turn a negative number into a positive one.
This doesn't really affect the value range of the attachment enum,
still I think the code was wrong.
Heavily tested.
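A minimal sketch of the pitfall described above. The helper names are hypothetical and are not the actual Dolphin code; the sketch only shows how a negative int wraps when converted to u32, and how validating while the value is still signed avoids that.

```cpp
#include <cassert>
#include <cstdint>

// Blindly converting a signed value to u32 wraps negative inputs around
// (modulo 2^32), so -1 turns into 4294967295.
inline std::uint32_t ToU32Unchecked(int value)
{
  return static_cast<std::uint32_t>(value);
}

// Hypothetical range check performed while the value is still signed, so a
// negative index is rejected instead of being seen as a huge positive one.
inline bool IsValidIndex(int value, int count)
{
  return value >= 0 && value < count;
}

// Hypothetical checked conversion: callers are expected to validate first.
inline std::uint32_t ToU32Checked(int value)
{
  assert(value >= 0);
  return static_cast<std::uint32_t>(value);
}
```

With these helpers, IsValidIndex(-1, 8) is false, while comparing ToU32Unchecked(-1) against an unsigned bound would only catch the problem by accident.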
Settings.SECTION_INI_ANDROID and Settings.SECTION_BINDINGS
both have the value "Android", but we only want the former
to be marked as being handled by the new config system.
This change fixes a problem where controller settings were
not being properly saved to Dolphin.ini.
The STL has everything we need nowadays.
I have tried to not alter any behavior or semantics with this
change wherever possible. In particular, WriteLow and WriteHigh
in CommandProcessor retain the ability to accidentally undo
another thread's write to the upper half or lower half
respectively. If that should be fixed, it should be done in a
separate commit for clarity. One thing did change: The places
where we were using += on a volatile variable (not an atomic
operation) are now using fetch_add (actually an atomic operation).
Tested with single core and dual core on x86-64 and AArch64.
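A rough sketch of the two behaviors mentioned above, assuming a 32-bit register stored in a std::atomic. The function signatures are hypothetical and only illustrate the idea: the load/modify/store pair in WriteLow and WriteHigh is not atomic as a whole, whereas fetch_add is a single atomic read-modify-write.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Each load and each store is atomic, but the pair is not: another thread's
// write to the upper half between the load and the store would be undone.
inline void WriteLow(std::atomic<std::uint32_t>& reg, std::uint16_t low)
{
  reg.store((reg.load() & 0xFFFF0000u) | low);
}

// Same caveat as WriteLow, for the lower half.
inline void WriteHigh(std::atomic<std::uint32_t>& reg, std::uint16_t high)
{
  reg.store((reg.load() & 0x0000FFFFu) | (std::uint32_t{high} << 16));
}

// fetch_add performs the whole read-modify-write atomically and returns the
// previous value, unlike += on a volatile variable.
inline std::uint32_t AddAndGetPrevious(std::atomic<std::uint32_t>& counter,
                                       std::uint32_t amount)
{
  return counter.fetch_add(amount);
}
```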
NumericSettings support a max, so let's use it.
It might not do much now, but the max and min values will be used to give visual feedback
in the UI in one of my upcoming input PRs.
The control expression editor allows line breaks, but the serialization was
losing anything after the first line break (\r \n).
Instead of opting to encode them and decode them on serialization
(which I tried, but it was not safe, as it would lose any \n written in the string by users),
I opted to replace them with a space.
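A minimal sketch of the replacement approach described above. FlattenLineBreaks is a hypothetical name, not the function used in the actual change.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Replaces every carriage return and line feed with a space so that the
// whole expression survives a single-line serialization format.
inline std::string FlattenLineBreaks(std::string expression)
{
  std::replace_if(
      expression.begin(), expression.end(),
      [](char c) { return c == '\r' || c == '\n'; }, ' ');
  assert(expression.find('\n') == std::string::npos);
  return expression;
}
```

Note that a literal backslash followed by "n" typed by the user is left untouched, since only real line break characters are replaced. A "\r\n" pair becomes two spaces in this sketch.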
and replacing it with a ":" prefix. Also remove white spaces and \n \t \r.
bugfix: fix EmulatedController::GetStateLock() not being acquired when reading the
expression reference
bugfix: MappingButton::UpdateIndicator() calling State(0) on outputs, breaking ongoing
rumbles if a game was running
Improvement: make expression previews appear in italic if they failed to parse correctly
Previously we set the texture coordinate to zero, now we set
the texture coordinate *index* to zero. This fixes the ripple
effect of the Mario painting in Luigi's Mansion.
This change should have no behavioral differences itself, but allows for changing the behavior of out of bounds tex coord indices more easily in the next commit. Without this change, returning tex0 for out of bounds cases and then applying the fixed-point logic would use the wrong tex dimension info (tex0 with I_TEXDIMS[1] or such), which is inaccurate.
Co-authored-by: Pokechu22 <Pokechu022@gmail.com>
Since the description updating is tied to the selection changing on the detail list, and the detail list is recreated on each object change, behavior was somewhat broken. Clearing the list changed the current row to zero, but nothing else (particularly m_object_data_offsets) had been updated, so the description was not necessarily correct (this is easier to observe now since the vertex data is at the end, so it's easier to get different lengths of register updates). Furthermore, subsequent clears did not update the current row since there was no visible selection, so it only changed the description once. The current row is now always set to zero, which forces an update (and also scrolls the list back to the top). The presence of FRAME_ROLE and OBJECT_ROLE are also checked so that the description is cleared if no object is selected.
- Only one search result is generated per command/line, even if there are multiple matches in that line.
- Pressing enter on the edit field begins a search, just like clicking the begin button.
- The next and previous buttons are disabled until a search is begun.
- The search results are cleared when changing objects or frames.
- The previous button once again works (a regression from the previous commit), and the register updates and graphics data for the correct object are searched.
- currentRow() never returns -1, so checking that is unnecessary (and misleading).
- The 'Invalid search parameters (no object selected)' message never showed up before because FRAME_ROLE is present if and only if OBJECT_ROLE is present.
This way, it can be focused with the render window behind it, instead of having the main window show up and cover the render window. This is useful for adjusting the object range, among other things.
If the number of objects varied, this would result in either missing objects on some frames, or too many objects on some frames; the latter case could cause crashes. Since it used the current frame to get the count, if the FIFO is started before the FIFO analyzer is opened, then the current frame is effectively random, making it hard to reproduce consistently.
This issue has existed since the FIFO analyzer was implemented for Qt.
The 'zero frames in the range' check can be removed because now there is always at least 1 frame; of course that might be the same frame over and over again, but that's still useful for e.g. Free Look (and the 1 frame repeating effect already occurred when frame count was exclusive).
A single object can be selected instead of 2 (it was already inclusive internally), and the maximum value is the highest number of objects in any frame (minus 1) to reduce jank when multiple frames are being played back.
Now that this is only called when playback actually starts (and not on unpausing), this change makes the experience a bit better (no more missing objects from not having reset the from object after changing FIFOs).
It is no longer relevant for the current set of loaders after 7030542546. If it becomes relevant again, a static function named IsUsable or IsCompatibleWithCurrentMachine or something would be a better approach.
By taking advantage of three-operand IMUL, we can eliminate a MOV
instruction. This is a small code size win. However, due to IMUL sign
extending the immediate value to 64 bits, we can only apply this when
the magic number's most significant bit is zero.
To ensure this can actually happen, we also minimize the magic number by
checking for trailing zeroes.
Example (Unsigned division by 18)
Before:
41 BE E4 38 8E E3 mov r14d,0E38E38E4h
4D 0F AF F5 imul r14,r13
49 C1 EE 24 shr r14,24h
After:
4D 69 F5 39 8E E3 38 imul r14,r13,38E38E39h
49 C1 EE 22 shr r14,22h
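The example above can be checked in plain C++. This is only a sketch of the arithmetic, not the JIT code; Magic, MinimizeMagic and FitsInSignExtendedImm32 are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Before: 32-bit magic number with the top bit set, shift by 0x24 (36).
constexpr std::uint32_t DivideBy18Before(std::uint32_t x)
{
  return static_cast<std::uint32_t>((std::uint64_t{x} * 0xE38E38E4u) >> 36);
}

// After: the two trailing zero bits were stripped from the magic number and
// the shift was reduced by the same amount (0x22 = 34).
constexpr std::uint32_t DivideBy18After(std::uint32_t x)
{
  return static_cast<std::uint32_t>((std::uint64_t{x} * 0x38E38E39u) >> 34);
}

struct Magic
{
  std::uint32_t multiplier;
  std::uint32_t shift;
};

// Hypothetical minimization step: drop trailing zeros from the multiplier
// while lowering the shift accordingly. The quotient is unchanged.
constexpr Magic MinimizeMagic(Magic m)
{
  while ((m.multiplier & 1) == 0 && m.shift > 0)
  {
    m.multiplier >>= 1;
    --m.shift;
  }
  return m;
}

// Three-operand IMUL sign-extends its 32-bit immediate, so the multiplier
// is only usable as an immediate when its most significant bit is zero.
constexpr bool FitsInSignExtendedImm32(std::uint32_t multiplier)
{
  return (multiplier & 0x80000000u) == 0;
}
```

Here 0xE38E38E4 does not fit as an immediate, but its minimized form 0x38E38E39 does.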
This isn't entirely necessary, as they are interpreted as bareword expressions,
but it's still nicer to have by default. And my upcoming input changes will
always put `` around single letter inputs.
-Add pause state to FPSCounter.
-Add ability to have more than one "OnStateChanged" callback in core.
-Add GetActualEmulationSpeed() to Core. Returns 1 by default. It's used by my input PRs.
The SaveToSYSCONF call in BootManager.cpp was unintentionally
overriding the temporary NAND set by the preceding
InitializeWiiRoot call. Fixes
https://bugs.dolphin-emu.org/issues/12500.
Verifying a Wii game creates an instance of IOS, and Dolphin
can't handle more than one instance of IOS at the same time.
Properly supporting it is probably more effort than it's worth.
Fixes https://bugs.dolphin-emu.org/issues/12494.
Avoids the need to copy the *.mo files manually *and* more importantly
this ensures that the mo files are always recreated if the build
output directory is cleared.
Update references was failing to update the references, causing input to stay nullptr and crash.
I fixed the case that triggered that, and also added nullptr checks for safety.
(cherry picked from commit 4bdcf707555a5568eddff957fa3604975ffb6ed7)
I think the AArch64 JIT has come far enough that it doesn't have to
be called experimental anymore.
I'm also labeling the x86-64 JIT as x86-64 for consistency with the
AArch64 JIT. This will especially be helpful if we start supporting
AArch64 on macOS, as AArch64 macOS can run both the x86-64 JIT and
the AArch64 JIT depending on whether you enable Rosetta 2.
I haven't observed this breaking any game, but it didn't match
the behavior of the interpreter as far as I could tell from
reading the code, in that denormals weren't being flushed.
If we can prove that FCVT will provide a correct conversion,
we can use FCVT. This makes the common case a bit faster
and the less likely cases (unfortunately including zero,
which FCVT actually can convert correctly) a bit slower.
Preparation for following commits.
This commit intentionally doesn't touch paired stores,
since paired stores are supposed to flush to zero.
(Consistent with Jit64.)
This simplifies some of the following commits. It does require
an extra register, but hey, we have 32 of them.
Something I think would be nice to add to the register cache
in the future is the ability to keep both the single and double
version of a guest register in two different host registers
when that is useful. That way, the extra register we write to
here can be read by a later instruction, saving us from
having to perform the same conversion again.
Fixes https://bugs.dolphin-emu.org/issues/12388. Might also fix
other games that have problems with float/paired instructions
in JitArm64, but I haven't tested any.
-They might have never drawn if DrawMessages wasn't called before they actually expired
-Their fade was wrong if the duration of the message was less than the fade time
This makes them much more useful for debugging. I know there might be other means
of debugging like logs and imgui, but this was the simplest so that's what I used.
If you want to print the same message every frame, but with a slightly different value
to see the changes, it now works.
Since messages are now always rendered at least once, a lot of old messages
(printed while the emulation was off) could show up on startup. To compensate,
I've added a "drop" time: if a message isn't rendered for the first
time within that time, it is dropped and never rendered.
When the interpreter writes to a discarded register, its type
must be changed so that it is no longer considered discarded.
Fixes a 62ce1c7 regression.
We normally check for division by zero to know if we should set the
destination register to zero with a XOR. However, when the divisor and
destination registers are the same the explicit zeroing can be omitted.
In addition, some of the surrounding branching can be simplified as
well.
Before:
45 85 FF test r15d,r15d
75 05 jne normal_path
45 33 FF xor r15d,r15d
EB 0C jmp done
normal_path:
B8 5A 00 00 00 mov eax,5Ah
99 cdq
41 F7 FF idiv eax,r15d
44 8B F8 mov r15d,eax
done:
After:
45 85 FF test r15d,r15d
74 0C je done
B8 5A 00 00 00 mov eax,5Ah
99 cdq
41 F7 FF idiv eax,r15d
44 8B F8 mov r15d,eax
done:
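The "After" sequence can be sketched in C++ as follows. DivideInPlace is a hypothetical name; the sketch models the case where the divisor and the destination share one register, and it deliberately leaves out the INT32_MIN / -1 overflow case.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// `reg` holds the divisor on entry and the quotient on exit.
inline void DivideInPlace(std::int32_t dividend, std::int32_t& reg)
{
  // Not covered by this sketch: INT32_MIN / -1 overflows.
  assert(!(dividend == std::numeric_limits<std::int32_t>::min() && reg == -1));

  // "je done": a zero divisor means the register already holds the zero
  // result, so no explicit XOR is needed.
  if (reg == 0)
    return;

  // Like IDIV, C++ integer division truncates toward zero.
  reg = dividend / reg;
}
```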
Division by a power of two can be slightly improved when the
destination and dividend registers are the same.
Before:
8B C6 mov eax,esi
85 C0 test eax,eax
8D 70 03 lea esi,[rax+3]
0F 49 F0 cmovns esi,eax
C1 FE 02 sar esi,2
After:
85 F6 test esi,esi
8D 46 03 lea eax,[rsi+3]
0F 48 F0 cmovs esi,eax
C1 FE 02 sar esi,2
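For reference, the arithmetic performed by both sequences can be written in C++ as below. This is a sketch with a hypothetical function name; it assumes that right-shifting a negative value is an arithmetic shift, which C++20 guarantees and which mainstream compilers did before that.

```cpp
#include <cassert>
#include <cstdint>

// Signed division by 2^shift that rounds toward zero, like the emitted code:
// negative dividends get a bias of (2^shift - 1) before the arithmetic shift.
inline std::int32_t DivideByPowerOfTwo(std::int32_t x, unsigned shift)
{
  assert(shift >= 1 && shift <= 30);
  const std::int32_t bias = (std::int32_t{1} << shift) - 1;  // lea eax,[rsi+3]
  const std::int32_t adjusted = x < 0 ? x + bias : x;        // cmovs esi,eax
  return adjusted >> shift;                                   // sar esi,2
}
```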
Repeated erase() + iteration on a std::multimap is extremely slow.
Slow enough that it causes a 7 second long stutter during some
transitions in F-Zero X (an N64 VC game that triggers many, many icache
invalidations).
And slow enough that JitBaseBlockCache::DestroyBlock shows up on a
flame graph as taking >50% of total CPU time on the CPU-GPU thread:
https://i.imgur.com/vvqiFL6.png
This commit optimises those block link queries by replacing the
std::multimap (which is typically implemented with red-black trees)
with hash tables.
Master: https://i.imgur.com/vvqiFL6.png / 7s stutters
(starting from 5.0-2021 and with branch following disabled)
This commit: https://i.imgur.com/hAO74fy.png / ~0.7s stutters, which
is pretty close to 5.0 stable. (5.0-2021 introduced the performance
regression and it is especially noticeable when branch following
is disabled, which is the case for all N64 VC games since 5.0-8377.)
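A rough sketch of the general idea, assuming links are keyed by destination address. Block and BlockLinkMap are hypothetical stand-ins and not the actual JitBaseBlockCache data structures; note that, unlike a std::multimap, the set in this sketch does not keep duplicate entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>

struct Block
{
  std::uint32_t address;
};

// Maps a destination address to the blocks that link to it. Lookup, insert
// and erase are average O(1) instead of walking a red-black tree.
class BlockLinkMap
{
public:
  void AddLink(std::uint32_t destination, Block* source)
  {
    assert(source != nullptr);
    m_links[destination].insert(source);
  }

  void RemoveLink(std::uint32_t destination, Block* source)
  {
    const auto it = m_links.find(destination);
    if (it == m_links.end())
      return;
    it->second.erase(source);
    if (it->second.empty())
      m_links.erase(it);
  }

  std::size_t CountLinksTo(std::uint32_t destination) const
  {
    const auto it = m_links.find(destination);
    return it == m_links.end() ? 0 : it->second.size();
  }

private:
  std::unordered_map<std::uint32_t, std::unordered_set<Block*>> m_links;
};
```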
VideoCommon: Change the type of BPMemory.scissorOffset to 10-bit signed: S32X10Y10
VideoBackends: Fix the Software Clipper.PerspectiveDivide function to use BPMemory.scissorOffset instead of the hard-coded 342
Oversight from #9545, which moved the "new game has been loaded" logic
to a separate OnNewTitleLoad function that has to be called explicitly
*after* a title has loaded.
Coupled with the commit that makes Dolphin not clobber 0x1800-0x3000
when using MIOS, this fixes Wind Waker and other MIOS-patched games
when they are launched from the System Menu.
MIOS puts patch data in low MEM1 (0x1800-0x3000) for its own use.
Overwriting data in this range can cause the IPL to crash when
launching games that get patched by MIOS.
See https://bugs.dolphin-emu.org/issues/11952 for more info.
Not applying the Gecko HLE patches means that Gecko codes will not work
under MIOS, but this is better than the alternative of having specific
games crash.
This particular range is kind of bizarre, and would only interpret
interleave mode 2 as a valid mode, while rejecting interleave mode 1 and
the extension byte mode.
As far as I know, based off the information on Wiibrew, we should be
considering all three values within this range as valid.
texture serialization and deserialization used to involve many memory
allocations and deallocations, along with many copies to and from
those allocations. avoid those by reserving a memory region inside the
output and writing there directly, skipping the allocation and copy to
an intermediate buffer entirely.
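The technique can be sketched as follows. AppendInPlace is a hypothetical helper, not the API used by the actual serialization code; it grows the output and lets the producer fill the new region directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Grows `output` by `size` bytes and lets `writer` fill the new region in
// place, so no intermediate buffer is allocated or copied.
template <typename Writer>
void AppendInPlace(std::vector<std::uint8_t>& output, std::size_t size, Writer&& writer)
{
  const std::size_t offset = output.size();
  output.resize(offset + size);
  writer(output.data() + offset, size);
  assert(output.size() == offset + size);  // the writer must not resize the output
}
```

One trade-off in this sketch: resize() zero-fills the new region before the writer overwrites it, which is still cheaper than a separate allocation plus a copy.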
This adds a CMake option (DOLPHIN_DEFAULT_UPDATE_TRACK) to allow
configuring SCM_UPDATE_TRACK_STR. This is needed to enable auto-updates
in Windows CMake builds by default.
This adds a function to get the emulated or real Bluetooth device for
an active emulation instance. This lets us deduplicate all the
`ios->GetDeviceByName("/dev/usb/oh1/57e/305")` calls that are currently
scattered in the codebase and ensures Bluetooth passthrough is being
handled correctly.
This also fixes the broken check in WiimoteCommon::UpdateSource.
There was a confusion between "emulated Bluetooth" (as opposed to
"real Bluetooth" aka Bluetooth passthrough) and "emulated Wiimote".
Specifically, 'Scooby-Doo! Mystery Mayhem', 'Scooby-Doo! Unmasked', 'Ed, Edd n Eddy: The Mis-Edventures', and the Wii version of 'Happy Feet'.
The JIT cache causes problems with emulated icache invalidation in these games, resulting in areas failing to load.
This avoids some warnings, which were originally fixed by ignoring loads with a value of zero (see 636bedb207 / #3242).
Note that FifoCI will report some changes, but only on the first frame; these seem to be timing related as they don't happen if a different write is used to replace skipped ones.
They appear to relate to perf queries, and combining them with truly unknown commands would probably hide useful information. Furthermore, 0x20 is issued by every title, so without this every title would be recorded as using an unknown command, which is very unhelpful.
The swaps are confusing and don't accomplish much.
It was originally written like this:
u32 pte = bswap(*(u32*)&base_mem[pteg_addr]);
then bswap was changed to Common::swap32, and then the array access
was replaced with Memory::Read_U32, leading to the useless swaps.
While 6xx_pem.pdf §7.6.1.1 mentions that the number of trailing
zeros in HTABORG must be equal to the number of trailing ones
in the mask (i.e. HTABORG must be properly aligned), this is actually
not a hard requirement. Real hardware will just OR the base address
anyway. Ignoring SDR changes would lead to incorrect emulation.
Logging a warning instead of dropping the SDR update silently is a
saner behaviour.
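A minimal sketch of the "warn but still apply" behaviour, under the assumption that the mask is a run of low ones, in which case HTABORG is properly aligned exactly when it shares no set bits with the mask. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct SdrUpdateResult
{
  std::uint32_t htaborg;
  std::uint32_t htabmask;
  bool warned;
};

// With a mask made of N low ones, HTABORG is aligned when its low N bits
// are zero, i.e. when the two values do not overlap.
inline bool IsHtabOrgAligned(std::uint32_t htaborg, std::uint32_t htabmask)
{
  return (htaborg & htabmask) == 0;
}

// The update is applied either way; a misaligned value only sets the
// warning flag (where real code would log a warning).
inline SdrUpdateResult UpdateSdr(std::uint32_t htaborg, std::uint32_t htabmask)
{
  const bool warned = !IsHtabOrgAligned(htaborg, htabmask);
  return {htaborg, htabmask, warned};
}
```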
debaf63fe8 moved the "Sonic epsilon hack"
to vertex shaders. However, it was only done for targets with depth
clamping. If this is not available, for example the target is OpenGL ES,
the Sonic problem appears (https://bugs.dolphin-emu.org/issues/11897).
A version of the "Sonic epsilon hack" is added for targets without
depth clamping.
This changes FileSystemProxy::Open to return a file descriptor wrapper
that will ensure the FD is closed when it goes out of scope.
By using such a wrapper we make it more difficult to forget to close
file descriptors.
This fixes a leak in ReadBootContent. I should have added such a class
from the beginning... In practice, I don't think this would have caused
any obvious issue because ReadBootContent is only called after an IOS
relaunch -- which clears all FDs -- and most titles do not get close
to the FD limit.
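A sketch of what such a wrapper could look like. ScopedFd and its Closer callback are hypothetical; the real wrapper returned by FileSystemProxy::Open may be shaped differently.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Owns a file descriptor and closes it when it goes out of scope.
class ScopedFd
{
public:
  using Closer = std::function<void(int)>;

  ScopedFd(int fd, Closer closer) : m_fd(fd), m_closer(std::move(closer)) {}
  ~ScopedFd() { Reset(); }

  ScopedFd(const ScopedFd&) = delete;
  ScopedFd& operator=(const ScopedFd&) = delete;

  // Moving transfers ownership; the moved-from wrapper no longer closes.
  ScopedFd(ScopedFd&& other)
      : m_fd(other.Release()), m_closer(std::move(other.m_closer))
  {
  }

  int Get() const { return m_fd; }

  // Gives up ownership without closing.
  int Release() { return std::exchange(m_fd, -1); }

  // Closes the descriptor now, if one is still owned.
  void Reset()
  {
    if (m_fd >= 0 && m_closer)
      m_closer(m_fd);
    m_fd = -1;
  }

private:
  int m_fd = -1;
  Closer m_closer;
};
```

Because the destructor does the closing, an early return in a function like ReadBootContent can no longer leak the descriptor.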
JitArm64::DoJit contains a check where it prints a warning and tries
to pause emulation if instructed to compile code at address 0. I'm
assuming this was done in order to provide a nicer error behavior
in cases where PC was accidentally set to null. Unfortunately, it
has started causing us problems recently, as 688bd61 writes and runs
some code at address 0 to simulate the PPC being held in reset.
What makes this worse is that calling Core::SetState from the CPU
thread is actually not allowed and will cause a deadlock instead of
the intended behavior. I don't believe there is anything on a real
console that would stop you from executing code at address 0 (as
long as the MMU has been set up to allow it), and Jit64::DoJit
doesn't contain any check like this, so let's remove the check.
This commit adds a new "discarded" state for registers.
Discarding a register is like flushing it, but without
actually writing its value back to memory. We can discard
a register only when it is guaranteed that no instruction
will read from the register before it is next written to.
Discarding reduces the register pressure a little, and can
also let us skip a few flushes on interpreter fallbacks.
The output of instructions like fabsx and ps_sel is store-safe
if and only if the relevant inputs are. The old code was always
marking the output as store-safe if the output was a single,
and never otherwise.
Also, the old code was treating the output of psq_l/psq_lu as
store-safe, which seems incorrect (if dequantization is disabled).
This improves the speed of verifying Wii WIA/RVZ files.
For me, the verification speed for LZMA2-compressed files
has gone from 11-12 MiB/s to 13-14 MiB/s.
One thing VolumeVerifier does to achieve parallelism is to
compute hashes for one chunk of data while reading the next
chunk of data. In master, when reading data from a Wii
partition, each such chunk is 32 KiB. This is normally fine,
but with WIA and RVZ it leads to rather lopsided read times
(without the compute times being lopsided): The first 32 KiB
of each 2 MiB takes a long time to read, and the remaining
part of the 2 MiB can be read nearly instantly. (The WIA/RVZ
code has to read the entire 2 MiB in order to compute hashes
which appear at the beginning of the 2 MiB, and then caches
the result afterwards.) This leads to us at times not doing
much reading and at other times not doing much computation.
To improve this, this change makes us use 2 MiB chunks
instead of 32 KiB chunks when reading from Wii partitions.
(block = 32 KiB, group = 2 MiB)
This can't actually happen in practice due to how WAD files work,
but it's very easy to add support for thanks to the last commit,
so we might as well add support for it.
The performance gains of doing this aren't too important since you
normally wouldn't run into any disc image that has overlapping blocks
(which by extension means overlapping partitions), but this change also
lets us get rid of things like VolumeVerifier's mutex that used to
exist just for the sake of handling overlapping blocks.
Panic alerts in DiscIO can potentially be very annoying since
large amounts of them can pop up when loading the game list
if you have some particularly weird files in your game list.
This was a much bigger problem back in 5.0 with its
"Tried to decrypt data from a non-Wii volume" panic alert, but
I figured I would take it all the way and remove the remaining
panic alerts that can show up when loading the game list.
I have exempted uses of ASSERT/ASSERT_MSG since they indicate
a bug in Dolphin rather than a malformed file.
If we know at compile time that the PPC carry flag definitely
has a certain value, we can bake that value into the emitted code
and skip having to read from PPCState.
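A small sketch of the decision described above, with hypothetical names. It models the JIT's compile-time knowledge of the carry flag as a std::optional<bool> and picks between a baked-in constant and a load from the emulated state.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

enum class CarryOperand
{
  ConstantZero,
  ConstantOne,
  LoadFromState,
};

// If the carry value is known when compiling the block, use a constant;
// otherwise the emitted code has to read the flag from PPCState.
inline CarryOperand SelectCarryOperand(std::optional<bool> known_carry)
{
  if (!known_carry.has_value())
    return CarryOperand::LoadFromState;
  return *known_carry ? CarryOperand::ConstantOne : CarryOperand::ConstantZero;
}

// Reference semantics of an add-with-carry: the result is the same whichever
// source the carry comes from, as long as the known value is correct.
inline std::uint32_t AddWithCarry(std::uint32_t a, std::uint32_t b,
                                  std::optional<bool> known_carry, bool state_carry)
{
  const bool carry = known_carry.value_or(state_carry);
  return a + b + (carry ? 1u : 0u);
}
```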