Normally, SI is polled at a rate defined by the game, and we have to send the pad state to other clients on every poll or else we'll desync. This can result in fairly high bandwidth usage, especially with multiple controllers, mostly due to UDP/IP overhead.
This change introduces an option to reduce the SI poll rate to once per frame, which may introduce up to one frame of additional latency, but will reduce bandwidth usage substantially, which is useful for users on very slow internet connections.
Polling SI less frequently than the game asked for did not seem to cause any problems in my testing, so this should be perfectly safe to do.
We can just use a vector of a vector, which also has the benefit of
keeping the size accounted for as well, allowing us to get rid of a
count parameter for CodesToHeader().
Instead of using an out-reference, we can modernize these to return the
std::string directly. While we're at it, also remove the unused name
parameter.
Given we now use a base class for the interface, we can make all member
functions, types and constants that aren't directly related to
instructions private.
HID2.LSQE is the Load/store quantize enable bit for non-indexed format
instructions (which are psq_l, psq_lu, psq_st, and psq_stu). If this bit
is not set and any of these instructions are attempted to be executed,
then a program exception is supposed to occur.
This register is defined as "optional reserved" within the aarch64 ABI.
Linux doesn't use it, but we must not modify it on ios or windows.
As we have plenty of registers on aarch64, let's just always skip this one.
This function was duplicated across all the opcode tables: the main info
tables, the interpreter tables, and the x86-64 JIT tables. However, we
can just make the type of the std::array parameter a template type and
get rid of this duplication.
const on a parameter being passed by value in a prototype doesn't actually signify
anything, these are only applicable in the definition, where they make
the opcode parameter immutable.
inline has external linkage, which doesn't really make sense here, given
the function is only used within this translation unit. So we can
replace inline with static.
While we're at it, the code within the function can also be compressed
to a single return statement.
Previously these were required to be built into the executable so that
the JIT portion of the DSP code would build properly, as the
x86-64-specifics were tightly coupled to the DSP common code. As this is
no longer the case, this is no longer necessary.
This adds a base class that is used to replace the concrete instance of
the x64 JIT pointer within DSPCore. This fully removes the direct use
(read: non-ifdefed) usage of x86-64-specifics within the main DSP code.
Said base can also be used for creating JITs for other architectures,
such as AArch64, etc.
This is one of the last things that needed to be done in order to
finally separate the x86-64-specific code from the rest of the common
DSP code. This splits the tables up similar to how it's currently done
for the PowerPC CPU tables.
Now, the tables are split up and within their own relevant source files,
so the main table within the common DSP code acts as the "info" table
that provides specifics about a particular instruction, while the other
tables contain the actual instruction.
With this out of the way, all that's left is to make a general base for
the emitters and we can then replace the x64 JIT pointer in DSPCore with
it, getting all x64 out of the common code once and for all.
While shuffling all the code around, the removal of the DSPEmitter
includes in some places uncovered indirect inclusions, so this also
fixes those as well.
Despite both being documented as read-only registers, only one of them
is truly read-only. An mtspr to HID1 will steamroll bits 0-4 with
bits 0-4 of whatever value is currently in the source register, the rest
of the bits are not modified as bits 5-31 are considered reserved, so
these ignore writes to them.
PVR on the other hand, is truly a read-only register. Attempts to write
to it don't modify the value within it, so we model this behavior.
This makes it much more straightforward to access WiimoteDevice
instances and also keeps the implementation details of accessing those
instances in one spot.
Given as all external accesses to the WiimoteDevice instances go through
this function, we can make the other two private.
Using reinterpret_cast (or a C-styled equivalent) to reinterpret
integers as floating-point values and vice-versa invokes undefined
behavior. Instead, use BitCast, which does this in a well-defined
manner.
According to PEM 3.3.6.1, if a division by zero occurs and FPSCR.ZE is
set, then the result of the instruction operation is unchanged (see
table 3-13). Similarly, if an invalid operation occurs and FPSCR.VE is
set, then the destination should also remain unchanged (see table 3-12).
Hardware also matches this behavior.
We were handling this for other relevant instructions, but we weren't
doing so for the arithmetic instructions. This corrects that.
This also alters our NI_* functions to return an FPResult type, which
allows us to see which kind of exception in particular is set in
exceptional cases. This is necessary for cases like the fdiv
instructions, which requires handling both ZE and VE being potentially
set.
These can be moved into the RegisterColumn constructor, which avoids
potential allocations in the case a std::function would otherwise need
to allocate to hold all of it's captured data.
Also tidy up the inclusion order while we're at it.
Previously the class was intermixing m_ prefixed variables and
non-prefixed ones, which can be misleading. Instead, we make the
prefixing consistent across the board.
Selecting Dummy or Memory Card would pass wrong values to EXI::ChangeDevice and not work as expected
Changing path had no effect until device was changed as it didn't call EXI::ChangeDevice at all
Makes the values strongly-typed and gets more identifiers out of the
global namespace.
We are forced to use anything that is not "None" to mean none, because
X11 is garbage in that it has:
\#define None 0L
Because clearly no one else will ever want to use that identifier for
anything in their own code (and is why you should prefix literally
any and all preprocessor macros you expose to library users in public
headers).
Makes the enum values strongly-typed and prevents the identifiers from
polluting the PowerPC namespace. This also cleans up the parameters of
some functions where we were accepting an ambiguous int type and
expecting the correct values to be passed in.
Now those parameters accept a PowerPC::CPUCore type only, making it
immediately obvious which values should be passed in. It also turns out
we were storing these core types into other structures as plain ints,
which have also been corrected.
As this type is used directly with the configuration code, we need to
provide our own overloaded insertion (<<) and extraction (>>) operators
in order to make it compatible with it. These are fairly trivial to
implement, so there's no issue here.
A minor adjustment to TryParse() was required, as our generic function
was doing the following:
N tmp = 0;
which is problematic, as custom types may not be able to have that
assignment performed (e.g. strongly-typed enums), so we change this to:
N tmp;
which is sufficient, as the value is attempted to be initialized
immediately under that statement.
This changes the identifier to represent the x86-64 DSP emitter. If any
other JITs for the DSP are added in the future, they all can't use the
same generic identifier.
In cases where we just want a random value for a primitive arithmetic
type, we can wrap this in a template to allow convenient direct
assignment instead of keeping declaration and initialization separate
(making it more difficult to use values uninitialized). This also allows
the use of Common::Random with functions such as std::generate, making
it more flexible in how random values can be generated.
This is only ever used internally. Also change the std::string name over
to a const char*, so that we don't need to potentially allocate anything
on the heap at immediate runtime.
Previously, a total of 114 std::string instances would need to construct
(allocating on the heap for larger strings that can't be stored with
small string optimizations). We can just use an array of const char*
strings instead, which allows us to avoid this.
Given JitBase shouldn't include platform specifics, we can generalize this
preprocessor define and allow any JIT to use it to indicate that generated code should be logged.
While we're at it, also move these defines beneath the includes with the
rest of the defines.
Rather than introduce this handling in every system instruction that modifies
the FPSCR directly, we can instead just handle it within the data structure
instead, which avoids duplicating mask handling across instructions.
This also allows handling proper masking from the debugger register
windows themselves without duplicating masking behavior there either.
ChunkFile doesn't use any of the file utilities, so we can drop these
headers to avoid pulling in unnecessary dependencies. This also
uncovered a few indirect inclusions.
This only queries internal state, it doesn't modify it. With minor
adjustments to BTEmu, this also allows us to make its usage instance a
constant reference.
The required version of MSVC already supports [[maybe_unused]], so we
can utilize this here. When GCC 7 and clang 3.9 become hard
requirements, we can eliminate this macro entirely and replace it with
[[maybe_unused]].
UNUSED is quite a generic macro name and has potential to clash with
other libraries, so rename it to DOLPHIN_UNUSED to prevent that, as well
as make its naming consistent with the force inline macro
This is much better as prefixed double underscores are reserved for the
implementation when it comes to identifiers. Another reason its better,
is that, on Windows, where __forceinline is a compiler built-in, with
the previous define, header inclusion software that detects unnecessary
includes will erroneously flag usages of Compiler.h as unnecessary
(despite being necessary on other platforms). So we define a macro
that's used by Windows and other platforms to ensure this doesn't
happen.
Instead of globbing things under an ambiguous Common.h header, move
compiler-specifics over to Compiler.h. This gives us a dedicated home
for anything related to compilers that we want to make functional across
all compilers that we support.
This moves us a little closer to eliminating Common.h entirely.
Rather than have a separate independent variable that we need to keep
track of in conjunction with the JIT code buffer size itself, amend the
analyst code to use the code buffer constant in JitBase.
Now if the size ever changes, then the analyst will automatically adjust
to handle it.
Given the code buffer is something truly common to all JIT
implementations, we can centralize it in the base class and avoid
duplicating it all over the place, while still allowing for differently
sized buffers.
Gets rid of an inclusion dependency with the DSP interpreter, as well as
a header-based dependency on the DSP opcode tables. This also uncovered
an indirect inclusion on the logger within DSPSymbols.cpp
As peculiar as this may be, decrementer exceptions by means of setting
the decrementer's zeroth bit from 0 to 1 is valid behavior by software
(and is defined in Programming Environments for 32-bit Microprocessors
in section 2.3.14.1 -- Decrementer operation). Given it's valid behavior,
it doesn't necessarily make sense to use a panic alert and halt, as this
isn't a condition where everything should be considered in a critical
state.
Instead, change it to an info log, so we still make note of it, but
without potentially tearing down state or halting emulation.
This fixes the The Last Story prototype that GerbilSoft was testing,
because the apploader is a bit more lenient with the max size of DOL
sections when it detects that you're using a devkit console.
Deduplicates code, and gets rid of some problems the old code had
(such as: bad performance when calling native functions, only one
disc showing up for multi-disc games, Wii banners being low-res,
unnecessarily much effort being needed for adding more metadata).
By making the jitted function a private static function of DSPEmitter,
we can allow access to data members within the context of the function
without making them public overall.
This finally makes all data members for the x64 DSP emitter private.
If we don't do this the prompt *may* appear behind the fullscreened window
and thus cause confusion. This happens both with exclusive fullscreen and
borderless fullscreen (e.g. for OpenGL).
This hardware behavior makes sense, as the FI bit is used to signify an
inexact result. An inexact result is a form of value that results during
the rounding phase of denormalization. If any bits of the significand
are lost during said rounding, then the result is considered to be
inexact.
However NaN and infinity are not classed as subnormals and therefore
don't undergo the denormalization step, making loss of precision not
possible (in NaN's case, numerically rounding something that is
literally Not a Number doesn't even make sense).
FR is set to indicate whether or not the last arithmetic or rounding and
conversion instruction that rounded the intermediate result incremented
the fractional portion of the result. Given neither input types would be
affected by this, this should also be unset.
This corrects more of the exceptional case handling for these values to
match hardware.
As suggested here: https://dolp.in/pr7059#pullrequestreview-125401778
More descriptive than having a std::tuple of FS::Mode, and lets us
give names to known triplets of modes (like in ES). Functions that
only forward mode arguments are slightly less verbose now too.
Prevents implicit conversions to types and requires explicitly
specifying them in order to construct instances of them. Given these are
used within emulation code directly, being explicit is always better
than implicit.
As explained within 179d73ac0d, the table
within the Programming Environments Manual for PowerPC lists the FI and
FR bits as cleared for invalid operation cases. So, we amend the
relevant cases here in order to be accurate to hardware.
As explained within commit a08ad82ace, if
an invalid exception occurs and VE is set, then the destination register
should remain unchanged. Ditto for when ZE is set and a zero divide
exception occurs.
This is only used internally, so we don't need to expose it in the
header. This also allows getting rid of inclusion of the byte swapping
utilities in the header as well.
Given they were only made public so that the callback could access class
state, we can simply make the callback a private static function of
CEXIMic, which allows access to members from the callback function
without making all of said members public.
In the PEM manual, within Table 3-12, which lists what should occur for
invalid operation exceptions, the FPSCR.FI and FPSCR.FR bits are listed
as "Cleared" for when FPSCR.VE is unset and set. So we clear these bits
as well to match hardware behavior.
In the PowerPC Microprocessor Family: The Programming Environments
Manual for 32 and 64-bit Microprocessors, in section 3.3.6.1, Table
3-12 lists what should occur if an invalid operation exception occurs in
situations where VE is set and when VE is not set. In the case where VE
is set, it lists the frD as "Unchanged". It also lists the FPRF flags as
"Unchanged".
Further down in Table 3-13, the listings for what should occur when zero
divide exceptions occur is listed, both for when ZE is set, and when it
isn't. When ZE is set, it lists frD as "Unchanged". It also lists the
FPRF flags as "Unchanged" as well.
This also alters the code so that we don't even calculate the result if
we don't need to compute it, making it a little bit less wasteful.
DataBinHeader is not used anywhere in the code other than via Header,
so let's merge them to reduce noise when accessing header fields
(currently we have to do header.hdr which looks silly).
It would make sense for 0x80 and 0xf0c0 to be respectively
sizeof(BkHeader) and sizeof(Header) as Nintendo is signing anything
that comes after the header, including the BkHeader.
The current WiiSave code is extremely messy, as it exposes all kinds of
implementation details in the header (including internal struct
definitions and magic numbers that don't have to be).
The read/write code is intermingled, so it's hard to tell which members
are used, or when/where they are set at all.
It also implicitly relies on some functions being called in a specific
order since it doesn't seek manually every time, which makes the code
even more fragile.
The logic is also hardcoded to only support bin->nand or nand->bin,
even though it would be useful to support nand->nand (for the
Movie save copying code, for example).
This commit attempts to solve these problems by getting rid of the
WiiSave class:
* Read/write code is moved to new Storage classes (NandStorage and
DataBinStorage) with small, clear functions that do one and only
one thing.
* The import/export logic was refactored into a generic Copy function
that takes two storages as parameters.
* The existing import and export functions are now just small wrappers
that call Copy with the appropriate storages.
This makes it easier to generate random numbers or fill a buffer with
random data in a cryptographically secure way.
This also replaces existing usages of RNG functions in the codebase:
* <random> is pretty hard to use correctly, and std::random_device does
not give enough guarantees about its results (it's
implementation-defined, non cryptographically secure and could be
deterministic on some platforms).
Doing things correctly is error prone and verbose.
* rand() is terrible and should not be used especially in crypto code.
Normalizes variable names to conform to our coding conventions.
Previously we were signifying some variables as externally linked
globals, which wasn't the case.
The definition of the function uses the ordering {mod, reg, rm}, which
is correct. Match the prototype to this, so that the parameter list
isn't misleading.
We can just use std::any_of here to collapse the checking code down to a
single assignment as opposed to a loop. This also slightly improves on
the existing code, as this won't continue to iterate through the cluster
metadata if an entry that's non-zero is encountered.
Pretty much all of the source files contain the following:
namespace IOS
{
namespace HLE
{
namespace <name>
{
// actual code here
} // namespace <name>
} // namespace HLE
} // namespace IOS
which is really verbose boilerplate, because most of the files inside
of Core/IOS are for IOS HLE.
This commit replaces that with a more concise `namespace IOS::HLE`
or `namespace IOS::HLE::(name)`.
This is just used as a means of carting around routines. It's not meant
to directly have functionality embedded within it--this is the job of
the inheriting data structure--so we can just make this a basic struct.
Particularly given all the data members were public to begin with.
Gets rid of the need to set up memcpy boilerplate to reinterpret between
floating-point and integers.
While we're at it, also do a minor bit of tidying.
Given this is what occurs in both constructors (as one just passes
through to another), we can just initialize the member directly.
While we're at it, amend the struct to follow the general ordering
convention of:
<new types>
<functions>
<variables>
Switching to blank NAND when emulation is running is an extremely bad
idea. It's akin to opening up a Wii and replacing the NAND chip while
you're playing a game on it.
Except we're not even replacing it with a NAND that has the same
contents. The blank NAND has nothing in it except the save file for
the current game, which is likely to result in the emulated software
getting inconsistent results and possibly even crashing depending on
how it caches title information.
An example of games that check the saves for other games is
Mario Kart Wii -- it checks the filesystem for Super Mario Galaxy saves
to decide whether to unlock characters. With this 'switch NAND
while emulation is active' misfeature, this will likely break.
And that's the main problem: it encourages sloppy emulation and no one
really knows how many things it can break.
Just don't let the user do horrible things like that during emulation.
If they want to use a blank NAND, they can do so by starting input
recording before launching a game. It's likely they will want to do
this if they plan to share their DTM anyway.
Another bit of behavior that we weren't performing correctly is the
unsetting of FPSCR.FI and FPSCR.FR when FPSCR.ZX is supposed to be set.
This is supported in PEM's section 3.3.6.1 where the following is
stated:
"
When a zero divide condition occurs, the following actions are taken:
- Zero divide exception condition bit is set FPSCR[ZX] = 1.
- FPSCR[FR, FI] are cleared.
"
And so, this fixes that behavior.
FPSCR[ZX] is the bit defined to represent the zero divide exception
condition bit, and is defined as (according to PowerPC Microprocessor
Family: The Programming Environments Manual for 32 and 64-bit
Microprocessors, which will be referred to as "PEM" for the rest of this
commit message) at section 3.3.6.1:
"
A zero divide exception condition occurs when a divide instructions is
executed with a zero divisor value and a finite, nonzero dividend value
or when a floating reciprocal estimate single (fres) or a floating
reciprocal square root estimate (frsqrte) instruction is executed with a
zero operand value.
"
Note that it states the divisor must be zero and the dividend must be
nonzero in order for ZX to be set. This means that the interpreter was
performing the wrong behavior for the case where 0/0 (with any sign on
the zeros) is performed. We would incorrectly set the ZX bit when only
the VXZDZ bit should be set.
It's also worth pointing out that N/0 (where N is any finite nonzero
value) and 0/0 are not within the same exception class. N/0 is a zero
divide exception case, while 0/0 is considered an invalid operation
exception case, which is also indicated in the PEM section 3.3.6.1 as
well where it lists the criteria for invalid operation exceptions.
Therefore we should only be setting the VXZDZ bit in the 0/0 case, not
VXZDZ and ZX. This was also verified via hardware tests to ensure that
this behavior indeed holds.
Fairly trivial to resolve, we just initialize the std::array with two
sets of braces (one set to create the array, the other to start and end the
aggregate data that we'll end up returning)
Given this is actually a part of the Host interface, this should be
placed with it.
While we're at it, turn it into an enum class so that we don't dump its
contained values into the surrounding scope. We can also make
Host_Message take the enum type itself directly instead of taking a
general int value.
After this, it'll be trivial to divide out the rest of Common.h and
remove the header from the repository entirely
If invalid operation exceptions are enabled and an invalid operation
occurs, then the destination value remains untouched. This fixes issues
that may arise when using these two instructions where the destination
gets steamrolled by an infinity or NaN value.
If a NaN of any type is passed as the operand to either of these
instructions, we shouldn't go down the regular code path, as we end up
potentially setting the wrong flags. For example, we wouldn't set the
FPSCR.VXCVI bit properly. We'd also set FPSCR.FI, when in actuality it
should be unset.
If an SNaN is passed as an operand, we also need to set the FPSCR.VXSNAN
bit as well.
The flag setting behavior for these can be found in Appendix C.4.2 in
PowerPC Microprocessor Family: The Programming Environments Manual for
32 and 64-bit Microprocessors.
fctiwz functions in the same manner as fctiw, with the difference being
that fctiwz always assumes the rounding mode being towards zero. Because
of this, we can implement fctiwz in terms of fctiw's code, but modify it
to accept a rounding mode, allowing us to preserve proper behavior for
both instructions.
We also move Helper_UpdateCR1 to a temporary home in
Interpreter_FPUtils.h for the time being. It would be more desirable to
move it to a new common header for all the helpers, so that even JITs
can use them if they so wish, however, this and the following changes
are intended to only touch the interpreter to keep changes minimal for
fixing instruction behavior.
JitCommon already duplicates the Helper_Mask function within
JitBase.cpp/.h, and the ARM JIT includes the Interpreter header in order
to call Helper_Carry. So a follow up is best suited here, as this
touches two other CPU backends.
We can just memcpy the data instead of pointer-casting data, which is
alignment-safe and doesn't run afoul of aliasing rules.
Previously it also made it seem as if data itself pointed to valid
usable data, but it doesn't, it simply functions as an out parameter
where we push data built up from the GetState() functions into it.
This was added in 4bdb4aa0d1 back in
2009-02-27. The only usage spot of this macro involves the same checks
that were used to define that preprocessor macro, so we can simply
remove the macro