The idea of this code was to not unroll loops, but it was completely broken.
So we've unrolled all loops, but only up to the second iteration.
Honestly, a better check would test if we branch to code which is already in the compiling block. But this is out of scope for now.
But testing shows that this unrolling actually improve the performance. So instead of fixing this bug, this check can be dropped.
Most settings which affect determinism will now be synced on NetPlay.
Additionally, there's a strict sync mode which will sync various
enhancements to prevent desync in games that use EFB reads.
This also adds a check for all players having the IPL.bin file, and
doesn't load it for anyone if someone is missing it. This prevents
desyncs caused by mismatched system fonts.
Additionally, the NetPlay window was getting too wide with checkboxes,
so FlowLayout has been introduced to make the checkboxes take up
multiple rows dynamically. However, there's some minor vertical
centering issues I haven't been able to solve, but it's better than a
ridiculously wide window.
This is accomplished by having SConfig::GetDirectoryForRegion no longer
return nullptr, as doing that was kind silly, considering we never
check for nullptr.
Basically everything here was race conditions in Qt callbacks, so I changed the client/server instances to std::shared_ptr and added null checks. It checks that the object exists in the callback, and the shared_ptr ensures it doesn't get destroyed until we're done with it.
MD5 check would also cause a segfault if you quit without cancelling it first, which was pretty silly.
This option completely disabled the DCBZ instruction. Users are toggling
this option in dolphin forks and using that same problematic config when
launching dolphin. Removing the option from dolphin will let the config be
ignored.
This adds the functionality of sending the host's save data (raw memory
cards, as well as GCI files and Wii saves with a matching GameID) to
all other clients. The data is compressed using LZO1X to greatly reduce
its size while keeping compression/decompression fast. Save
synchronization is enabled by default, and toggleable with a checkbox
in the NetPlay dialog.
On clicking start, if the option is enabled, game boot will be delayed
until all players have received the save data sent by the host. If any
player fails to receive it properly, boot will be cancelled to prevent
desyncs.
Our usage of glFinish() can cause driver crashes and/or lockups.
Please note that this disables the background shader compilation (i.e.
all shaders will be compiled on boot). There is no way around this.
Reduces the amount of dependencies dragged in by the main window's
header. This also removes MainWindow.h includes elsewhere where they
aren't necessary, reducing the amount of UI files that need to be
recompiled if the main window's header changes.
We don't want to write this setting to disk, as SConfig has problems
with leaking settings changed by GameINI into the base configs. The
result of this is that if someone plays an N64 VC game (or other game
where we disable this setting) the branch following option can get
unintentionally disabled globally, which will reduce performance in
many games and cause NetPlay to desync with users who still have it
enabled.
Lessens the amount of files that have to be recompiled if
ConfigManager.h is modified. This also removes an indirect inclusion
within DolphinQt/Main.cpp.
AVFormatContext::filename was deprecated in lavf 58.7.100 in favor
of AVFormatContext::url. Instead of adding version-checking logic,
just use the passed-in dump path instead.
Makes it less error-prone to get state data from analog sticks (no need
to pass any locals), and also allows direct assignment, letting the
retrieved data be const.
Makes it less error-prone to get state data from tilt controls (no need
to pass any pointers to locals), and also allows direct assignment,
letting the retrieved data be const.
Makes it less error-prone to get state data from sliders (no need
to pass any locals), and also allows direct assignment, letting the
retrieved data be const.
Makes it less error-prone to get state data from cursors (no need
to pass any pointers to locals), and also allows direct assignment,
letting the retrieved data be const.
Makes it less error-prone to get state data from analog sticks (no need
to pass any locals), and also allows direct assignment, letting the
retrieved data be const.
Ensures they match their naming within the definition of the function.
In EmulateSwing's case, one parameter was erroneously named tilt_group,
when it's actually supposed to be swing_group.
These aren't necessary in the prototype, however they do apply in the
definition of the function. This just cuts down on line noise within the
prototypes.
This is only ever read from externally, so we can expose a getter that ensures that
immutability, while making the actual instance internal. Given the
filling out of these settings depends on packets received by the client
instance, it makes more sense to make it a part of the client itself.
This trims off one lingering global.
This stops clients randomly deadlocking when a spectator leaves, as the mappings construct is not thread-safe and should not be written while the game is running.
Avoids dragging in a bunch of includes from the header files, and also
reduces the amount of files that need to be recompiled if one of those
included headers' source content is ever changed.
Previously we wouldn't indicate if saving or loading these files
happened to fail. In some cases we'd only print out to the logger, but
this is a pretty poor way to tell a user of the interface that something
went wrong in a direct way (the logging messages aren't able to be localized
either).
This packet is only used by the host to detect desyncs, and we don't really need to know the exact frame we desynced on (unless you're debugging, but you can just recompile for that), so it's perfectly fine to just send it less often. This makes it so the timebase packet is sent only every 60 frames, rather than every frame, which further cuts back on unnecessary bandwidth consumption.
PowerPCState's cr_val member is an array of u64s, so we can just use the
correct printf macro specifier within cinttypes. This also avoids
truncation on operating systems that use an LLP64 data model (like
Windows), where long is actually 32 bits in size, not 64-bit, which
could result in wonky values being printed, should Trace ever be used on
it.
define how many frames constitute a high or a low swing/shake when the
button is down. Also configurable is the number of frames to execute
the swing/shake after the button is released.
When disabled only inputs from TAS dialog are used.
When enabled inputs from TAS dialog are used, except when a change in
input is detected from a real controller, in this case the TAS value is
replaced with the real controller value.
Previously there was only one function under the NetPlay namespace,
which is kind of silly considering we have all of these other types
and functions existing outside of the namespace.
This moves the rest of them into the namespace.
This gets some general names, like Player, for example, out of the global namespace.
Works around a bug in QtCore that will cause crashes when
QFileSystemWatcher::addPath is called on a directory that is located on a
removable device (USB mass storage devices, etc.)
This should make the NetPlay dialog appear as a separate window in the taskbar on most systems, which makes more sense than a parented dialog as the user will leave it open for an extended period.
We want this setting to invalidate the cache because it may affect the appearance of textures in the rendered scene, therefore one would expect changing it while the game is running to have the expected effect immediately.
Initializing GraphicsWindow layout & children requires cooperation from
the graphics stack: on my system, for example, it causes a Vulkan
context to get created in order to get driver info. This is a slow
operation, and right now it is taking about 60-70% of the Dolphin
startup time on my system.
Move instead to a lazy-initialization model where the constructor
does nothing, instead offloading work to a separate Initialize() method
called before the window is shown.
I would expect this should be done for other larger parts of the UI,
especially the ones where creating widgets ends up triggering large IO
subsystems (I suspect controller configuration might be doing that).
(I'm not super happy with how this is implemented, but right now it's a
one-off, and it's a major complaint users have with the new UI. I
prioritized getting something working quickly...)
Now the detection heuristic has changed, the old value is no longer
valid.
Some example thresholds for known mipmap effects that should trigger:
SMG's lava has a mimimum difference of ~17.8, SMG2's clouds have a
minimum difference of ~14.8, and Wind Waker's foam has a minimum
difference of ~15
Non-triggering examples were tested and all had a calculated difference
lower than 3.
So a value of 14 should lean towards false-negatives instead of
positives, but this is clearly incomplete testing and may require
further tweaks later.
This no longer converts from sRGB to linear for the reference mip
downsample - even if the original mipmap creation tool used an sRGB
colorspace (which isn't really guaranteed, and may even change per
game), this is a "fast" heuristic that's only an estimate anyway.
The average diff is also now stored in a u64, avoiding floating point
calculations in the per-pixel hot loop.
This should speed up the detection significantly, hopefully fixing
jank when loading in new textures.
Normally, SI is polled at a rate defined by the game, and we have to send the pad state to other clients on every poll or else we'll desync. This can result in fairly high bandwidth usage, especially with multiple controllers, mostly due to UDP/IP overhead.
This change introduces an option to reduce the SI poll rate to once per frame, which may introduce up to one frame of additional latency, but will reduce bandwidth usage substantially, which is useful for users on very slow internet connections.
Polling SI less frequently than the game asked for did not seem to cause any problems in my testing, so this should be perfectly safe to do.
Given we now use a base class for the interface, we can make all member
functions, types and constants that aren't directly related to
instructions private.
HID2.LSQE is the Load/store quantize enable bit for non-indexed format
instructions (which are psq_l, psq_lu, psq_st, and psq_stu). If this bit
is not set and any of these instructions are attempted to be executed,
then a program exception is supposed to occur.
This register is defined as "optional reserved" within the aarch64 ABI.
Linux doesn't use it, but we must not modify it on ios or windows.
As we have plenty of registers on aarch64, let's just always skip this one.
This function was duplicated across all the opcode tables: the main info
tables, the interpreter tables, and the x86-64 JIT tables. However, we
can just make the type of the std::array parameter a template type and
get rid of this duplication.
const on a parameter being passed by value in a prototype doesn't actually signify
anything, these are only applicable in the definition, where they make
the opcode parameter immutable.
inline has external linkage, which doesn't really make sense here, given
the function is only used within this translation unit. So we can
replace inline with static.
While we're at it, the code within the function can also be compressed
to a single return statement.
Previously these were required to be built into the executable so that
the JIT portion of the DSP code would build properly, as the
x86-64-specifics were tightly coupled to the DSP common code. As this is
no longer the case, this is no longer necessary.
This adds a base class that is used to replace the concrete instance of
the x64 JIT pointer within DSPCore. This fully removes the direct use
(read: non-ifdefed) usage of x86-64-specifics within the main DSP code.
Said base can also be used for creating JITs for other architectures,
such as AArch64, etc.
This is one of the last things that needed to be done in order to
finally separate the x86-64-specific code from the rest of the common
DSP code. This splits the tables up similar to how it's currently done
for the PowerPC CPU tables.
Now, the tables are split up and within their own relevant source files,
so the main table within the common DSP code acts as the "info" table
that provides specifics about a particular instruction, while the other
tables contain the actual instruction.
With this out of the way, all that's left is to make a general base for
the emitters and we can then replace the x64 JIT pointer in DSPCore with
it, getting all x64 out of the common code once and for all.
While shuffling all the code around, the removal of the DSPEmitter
includes in some places uncovered indirect inclusions, so this also
fixes those as well.
Despite both being documented as read-only registers, only one of them
is truly read-only. An mtspr to HID1 will steamroll bits 0-4 with
bits 0-4 of whatever value is currently in the source register, the rest
of the bits are not modified as bits 5-31 are considered reserved, so
these ignore writes to them.
PVR on the other hand, is truly a read-only register. Attempts to write
to it don't modify the value within it, so we model this behavior.
This makes it much more straightforward to access WiimoteDevice
instances and also keeps the implementation details of accessing those
instances in one spot.
Given as all external accesses to the WiimoteDevice instances go through
this function, we can make the other two private.
Using reinterpret_cast (or a C-styled equivalent) to reinterpret
integers as floating-point values and vice-versa invokes undefined
behavior. Instead, use BitCast, which does this in a well-defined
manner.
According to PEM 3.3.6.1, if a division by zero occurs and FPSCR.ZE is
set, then the result of the instruction operation is unchanged (see
table 3-13). Similarly, if an invalid operation occurs and FPSCR.VE is
set, then the destination should also remain unchanged (see table 3-12).
Hardware also matches this behavior.
We were handling this for other relevant instructions, but we weren't
doing so for the arithmetic instructions. This corrects that.
This also alters our NI_* functions to return an FPResult type, which
allows us to see which kind of exception in particular is set in
exceptional cases. This is necessary for cases like the fdiv
instructions, which requires handling both ZE and VE being potentially
set.
These can be moved into the RegisterColumn constructor, which avoids
potential allocations in the case a std::function would otherwise need
to allocate to hold all of it's captured data.
Also tidy up the inclusion order while we're at it.
Previously the class was intermixing m_ prefixed variables and
non-prefixed ones, which can be misleading. Instead, we make the
prefixing consistent across the board.
Selecting Dummy or Memory Card would pass wrong values to EXI::ChangeDevice and not work as expected
Changing path had no effect until device was changed as it didn't call EXI::ChangeDevice at all
Makes the values strongly-typed and gets more identifiers out of the
global namespace.
We are forced to use anything that is not "None" to mean none, because
X11 is garbage in that it has:
\#define None 0L
Because clearly no one else will ever want to use that identifier for
anything in their own code (and is why you should prefix literally
any and all preprocessor macros you expose to library users in public
headers).
Makes the enum values strongly-typed and prevents the identifiers from
polluting the PowerPC namespace. This also cleans up the parameters of
some functions where we were accepting an ambiguous int type and
expecting the correct values to be passed in.
Now those parameters accept a PowerPC::CPUCore type only, making it
immediately obvious which values should be passed in. It also turns out
we were storing these core types into other structures as plain ints,
which have also been corrected.
As this type is used directly with the configuration code, we need to
provide our own overloaded insertion (<<) and extraction (>>) operators
in order to make it compatible with it. These are fairly trivial to
implement, so there's no issue here.
A minor adjustment to TryParse() was required, as our generic function
was doing the following:
N tmp = 0;
which is problematic, as custom types may not be able to have that
assignment performed (e.g. strongly-typed enums), so we change this to:
N tmp;
which is sufficient, as the value is attempted to be initialized
immediately under that statement.
This changes the identifier to represent the x86-64 DSP emitter. If any
other JITs for the DSP are added in the future, they all can't use the
same generic identifier.
In cases where we just want a random value for a primitive arithmetic
type, we can wrap this in a template to allow convenient direct
assignment instead of keeping declaration and initialization separate
(making it more difficult to use values uninitialized). This also allows
the use of Common::Random with functions such as std::generate, making
it more flexible in how random values can be generated.
This is only ever used internally. Also change the std::string name over
to a const char*, so that we don't need to potentially allocate anything
on the heap at immediate runtime.
Previously, a total of 114 std::string instances would need to construct
(allocating on the heap for larger strings that can't be stored with
small string optimizations). We can just use an array of const char*
strings instead, which allows us to avoid this.
Given JitBase shouldn't include platform specifics, we can generalize this
preprocessor define and allow any JIT to use it to indicate that generated code should be logged.
While we're at it, also move these defines beneath the includes with the
rest of the defines.
Rather than introduce this handling in every system instruction that modifies
the FPSCR directly, we can instead just handle it within the data structure
instead, which avoids duplicating mask handling across instructions.
This also allows handling proper masking from the debugger register
windows themselves without duplicating masking behavior there either.