This hardware behavior makes sense, as the FI bit is used to signify an
inexact result. An inexact result is a form of value that results during
the rounding phase of denormalization. If any bits of the significand
are lost during said rounding, then the result is considered to be
inexact.
However NaN and infinity are not classed as subnormals and therefore
don't undergo the denormalization step, making loss of precision not
possible (in NaN's case, numerically rounding something that is
literally Not a Number doesn't even make sense).
FR is set to indicate whether or not the last arithmetic or rounding and
conversion instruction that rounded the intermediate result incremented
the fractional portion of the result. Given neither input types would be
affected by this, this should also be unset.
This corrects more of the exceptional case handling for these values to
match hardware.
As suggested here: https://dolp.in/pr7059#pullrequestreview-125401778
More descriptive than having a std::tuple of FS::Mode, and lets us
give names to known triplets of modes (like in ES). Functions that
only forward mode arguments are slightly less verbose now too.
Prevents implicit conversions to types and requires explicitly
specifying them in order to construct instances of them. Given these are
used within emulation code directly, being explicit is always better
than implicit.
As explained within 179d73ac0d, the table
within the Programming Environments Manual for PowerPC lists the FI and
FR bits as cleared for invalid operation cases. So, we amend the
relevant cases here in order to be accurate to hardware.
As explained within commit a08ad82ace, if
an invalid exception occurs and VE is set, then the destination register
should remain unchanged. Ditto for when ZE is set and a zero divide
exception occurs.
This is only used internally, so we don't need to expose it in the
header. This also allows getting rid of inclusion of the byte swapping
utilities in the header as well.
Given they were only made public so that the callback could access class
state, we can simply make the callback a private static function of
CEXIMic, which allows access to members from the callback function
without making all of said members public.
In the PEM manual, within Table 3-12, which lists what should occur for
invalid operation exceptions, the FPSCR.FI and FPSCR.FR bits are listed
as "Cleared" for when FPSCR.VE is unset and set. So we clear these bits
as well to match hardware behavior.
In the PowerPC Microprocessor Family: The Programming Environments
Manual for 32 and 64-bit Microprocessors, in section 3.3.6.1, Table
3-12 lists what should occur if an invalid operation exception occurs in
situations where VE is set and when VE is not set. In the case where VE
is set, it lists the frD as "Unchanged". It also lists the FPRF flags as
"Unchanged".
Further down in Table 3-13, the listings for what should occur when zero
divide exceptions occur is listed, both for when ZE is set, and when it
isn't. When ZE is set, it lists frD as "Unchanged". It also lists the
FPRF flags as "Unchanged" as well.
This also alters the code so that we don't even calculate the result if
we don't need to compute it, making it a little bit less wasteful.
DataBinHeader is not used anywhere in the code other than via Header,
so let's merge them to reduce noise when accessing header fields
(currently we have to do header.hdr which looks silly).
It would make sense for 0x80 and 0xf0c0 to be respectively
sizeof(BkHeader) and sizeof(Header) as Nintendo is signing anything
that comes after the header, including the BkHeader.
The current WiiSave code is extremely messy, as it exposes all kinds of
implementation details in the header (including internal struct
definitions and magic numbers that don't have to be).
The read/write code is intermingled, so it's hard to tell which members
are used, or when/where they are set at all.
It also implicitly relies on some functions being called in a specific
order since it doesn't seek manually every time, which makes the code
even more fragile.
The logic is also hardcoded to only support bin->nand or nand->bin,
even though it would be useful to support nand->nand (for the
Movie save copying code, for example).
This commit attempts to solve these problems by getting rid of the
WiiSave class:
* Read/write code is moved to new Storage classes (NandStorage and
DataBinStorage) with small, clear functions that do one and only
one thing.
* The import/export logic was refactored into a generic Copy function
that takes two storages as parameters.
* The existing import and export functions are now just small wrappers
that call Copy with the appropriate storages.
This makes it easier to generate random numbers or fill a buffer with
random data in a cryptographically secure way.
This also replaces existing usages of RNG functions in the codebase:
* <random> is pretty hard to use correctly, and std::random_device does
not give enough guarantees about its results (it's
implementation-defined, non cryptographically secure and could be
deterministic on some platforms).
Doing things correctly is error prone and verbose.
* rand() is terrible and should not be used especially in crypto code.
Normalizes variable names to conform to our coding conventions.
Previously we were signifying some variables as externally linked
globals, which wasn't the case.
The definition of the function uses the ordering {mod, reg, rm}, which
is correct. Match the prototype to this, so that the parameter list
isn't misleading.
We can just use std::any_of here to collapse the checking code down to a
single assignment as opposed to a loop. This also slightly improves on
the existing code, as this won't continue to iterate through the cluster
metadata if an entry that's non-zero is encountered.
Pretty much all of the source files contain the following:
namespace IOS
{
namespace HLE
{
namespace <name>
{
// actual code here
} // namespace <name>
} // namespace HLE
} // namespace IOS
which is really verbose boilerplate, because most of the files inside
of Core/IOS are for IOS HLE.
This commit replaces that with a more concise `namespace IOS::HLE`
or `namespace IOS::HLE::(name)`.
This is just used as a means of carting around routines. It's not meant
to directly have functionality embedded within it--this is the job of
the inheriting data structure--so we can just make this a basic struct.
Particularly given all the data members were public to begin with.
Gets rid of the need to set up memcpy boilerplate to reinterpret between
floating-point and integers.
While we're at it, also do a minor bit of tidying.
Given this is what occurs in both constructors (as one just passes
through to another), we can just initialize the member directly.
While we're at it, amend the struct to follow the general ordering
convention of:
<new types>
<functions>
<variables>
Switching to blank NAND when emulation is running is an extremely bad
idea. It's akin to opening up a Wii and replacing the NAND chip while
you're playing a game on it.
Except we're not even replacing it with a NAND that has the same
contents. The blank NAND has nothing in it except the save file for
the current game, which is likely to result in the emulated software
getting inconsistent results and possibly even crashing depending on
how it caches title information.
An example of games that check the saves for other games is
Mario Kart Wii -- it checks the filesystem for Super Mario Galaxy saves
to decide whether to unlock characters. With this 'switch NAND
while emulation is active' misfeature, this will likely break.
And that's the main problem: it encourages sloppy emulation and no one
really knows how many things it can break.
Just don't let the user do horrible things like that during emulation.
If they want to use a blank NAND, they can do so by starting input
recording before launching a game. It's likely they will want to do
this if they plan to share their DTM anyway.
Another bit of behavior that we weren't performing correctly is the
unsetting of FPSCR.FI and FPSCR.FR when FPSCR.ZX is supposed to be set.
This is supported in PEM's section 3.3.6.1 where the following is
stated:
"
When a zero divide condition occurs, the following actions are taken:
- Zero divide exception condition bit is set FPSCR[ZX] = 1.
- FPSCR[FR, FI] are cleared.
"
And so, this fixes that behavior.
FPSCR[ZX] is the bit defined to represent the zero divide exception
condition bit, and is defined as (according to PowerPC Microprocessor
Family: The Programming Environments Manual for 32 and 64-bit
Microprocessors, which will be referred to as "PEM" for the rest of this
commit message) at section 3.3.6.1:
"
A zero divide exception condition occurs when a divide instructions is
executed with a zero divisor value and a finite, nonzero dividend value
or when a floating reciprocal estimate single (fres) or a floating
reciprocal square root estimate (frsqrte) instruction is executed with a
zero operand value.
"
Note that it states the divisor must be zero and the dividend must be
nonzero in order for ZX to be set. This means that the interpreter was
performing the wrong behavior for the case where 0/0 (with any sign on
the zeros) is performed. We would incorrectly set the ZX bit when only
the VXZDZ bit should be set.
It's also worth pointing out that N/0 (where N is any finite nonzero
value) and 0/0 are not within the same exception class. N/0 is a zero
divide exception case, while 0/0 is considered an invalid operation
exception case, which is also indicated in the PEM section 3.3.6.1 as
well where it lists the criteria for invalid operation exceptions.
Therefore we should only be setting the VXZDZ bit in the 0/0 case, not
VXZDZ and ZX. This was also verified via hardware tests to ensure that
this behavior indeed holds.
Fairly trivial to resolve, we just initialize the std::array with two
sets of braces (one set to create the array, the other to start and end the
aggregate data that we'll end up returning)
Given this is actually a part of the Host interface, this should be
placed with it.
While we're at it, turn it into an enum class so that we don't dump its
contained values into the surrounding scope. We can also make
Host_Message take the enum type itself directly instead of taking a
general int value.
After this, it'll be trivial to divide out the rest of Common.h and
remove the header from the repository entirely
If invalid operation exceptions are enabled and an invalid operation
occurs, then the destination value remains untouched. This fixes issues
that may arise when using these two instructions where the destination
gets steamrolled by an infinity or NaN value.
If a NaN of any type is passed as the operand to either of these
instructions, we shouldn't go down the regular code path, as we end up
potentially setting the wrong flags. For example, we wouldn't set the
FPSCR.VXCVI bit properly. We'd also set FPSCR.FI, when in actuality it
should be unset.
If an SNaN is passed as an operand, we also need to set the FPSCR.VXSNAN
bit as well.
The flag setting behavior for these can be found in Appendix C.4.2 in
PowerPC Microprocessor Family: The Programming Environments Manual for
32 and 64-bit Microprocessors.
fctiwz functions in the same manner as fctiw, with the difference being
that fctiwz always assumes the rounding mode being towards zero. Because
of this, we can implement fctiwz in terms of fctiw's code, but modify it
to accept a rounding mode, allowing us to preserve proper behavior for
both instructions.
We also move Helper_UpdateCR1 to a temporary home in
Interpreter_FPUtils.h for the time being. It would be more desirable to
move it to a new common header for all the helpers, so that even JITs
can use them if they so wish, however, this and the following changes
are intended to only touch the interpreter to keep changes minimal for
fixing instruction behavior.
JitCommon already duplicates the Helper_Mask function within
JitBase.cpp/.h, and the ARM JIT includes the Interpreter header in order
to call Helper_Carry. So a follow up is best suited here, as this
touches two other CPU backends.
We can just memcpy the data instead of pointer-casting data, which is
alignment-safe and doesn't run afoul of aliasing rules.
Previously it also made it seem as if data itself pointed to valid
usable data, but it doesn't, it simply functions as an out parameter
where we push data built up from the GetState() functions into it.
This was added in 4bdb4aa0d1 back in
2009-02-27. The only usage spot of this macro involves the same checks
that were used to define that preprocessor macro, so we can simply
remove the macro
If any operand is a signaling NaN, we need to signify this by setting
the VXSNAN bit.
Fixes NaN flag setting for fmsub, fmsubs, fnmsub, fnmsubs, ps_msub, and
ps_nmsub instructions.
If any operand is a signaling NaN, we need to signify this by setting
the VXSNAN bit.
Fixes NaN flag setting for fmadd, fmadds, fnmadd, fnmadds, ps_madd,
ps_nmadd, ps_madds0, and ps_madds1
If either operand is a signaling NaN, we need to signify this by setting
the VXSNAN bit.
This fixes NaN flag setting for fsub, fsubs, and ps_sub instructions.
If either operand is a signaling NaN, we need to signify that by setting
the VXSNAN bit.
This fixes NaN flag setting for fdiv, fdivs and ps_div instructions.
These aren't used to modify the data they point to, so make that
explicit. Also while we're at it, add const to any nearby variables that
can be made so.
This might happen if someone moves settings between e.g. a PC and
an Android device, or if someone was using JITIL and updates Dolphin.
I also made the panic alert a bit more explanatory.
It's not used anywhere other than in DolphinQt2, where the usage is
incorrect and stupid since we shouldn't be trying to stop the core
and 'restore config' that was changed by the core at app exit time,
but immediately when the core is being shut down.
Makes all of the naming consistent with our code style, and makes
parameters match their header equivalents.
Essentially just a clean-up of things that weren't migrated over
already.
If either of the operands are signaling NaNs, then an invalid operation
exception needs to be indicated within the FPSCR.
This corrects SNaN flag setting for fmul, fmuls, ps_mul, ps_muls0, and
ps_muls1.
In old GCC versions, capturing 'this' does not work for some lambdas.
The workaround is to not use auto for the parameter (even though the
type is obvious). This can be dropped once we require GCC 7.
If the input is a signaling NaN, then we need to signal that via setting
the FPSCR.VXSNAN bit. We also shouldn't update the FPRF flags if
FPSCR.VE is set.
If the FPSCR.VE bit is set and an invalid operand is passed in, then the FPRF
shouldn't be updated. Similarly this is also the case when the FPSCR.ZE bit
is set and negative or positive zero is passed in as the operand.
If FPSCR.ZE is set and a divide by zero exception is signaled, then the
FPRF shouldn't be updated with a result. Similarly, if the input is an
SNaN and FPSCR.VE is set, then the FPRF shouldn't be updated.
The VX bit is intended to be a summary bit indicating the occurrence of
any kind of invalid operation. Therefore, whenever an invalid operation
exception is set, also set VX.
This corrects our CR flag setting for multiple instructions in certain
scenarios. This corrects flag setting cases in fadd, fadds, fctiw, fctiwz, fdiv,
frsp, frsqrte, fsub, and fsubs (and technically every floating-point
instruction that we make more accurate in the future with regards to
flag setting).
Coherent mappings have a lower overhead and less GL codes.
So enables coherent mapping by default for all drivers.
Both Qualcomm and ARM performs very bad with explicit flushing, so this change helps them as well.
AFAIK there was one GPU generation which was slower on coherent mapping: nvidia tesla
So Geforce 200 and 300 series should be tested with this PR before merging.
As this was last tested many years ago, this issue might have been fixed as well.
Those GPUs are close to 10 years old and not supported any more by nvidia.
This ports the Wii filesystem root, Wii SD card path and dump path
settings to the new config system (OnionConfig).
My initial plan was to wait until DolphinWX was removed before porting
most of the Main (Core, DSP, General) settings to onion config, but
I've decided to submit a small part of those changes to fix
[issue 10566](https://bugs.dolphin-emu.org/issues/10566).
Removes the need to manually set the FileUtil path in the UI frontends
and gets rid of some more members that don't really belong in SConfig.
Also fixes a bug which would cause the dump path not to get created
after change.
Only invoke config changed callbacks from Config::Save, not
Layer::Save. The latter results in callbacks being called
once per layer, up to 7 times per save.
These disc images are only used on dev units and not retail units.
There are two important differences compared to normal Wii disc images:
- The data starts 0x8000 bytes into each partition instead of 0x20000
- The data of a partition is stored unencrypted and contains no hashes
Our old implementation was just guesswork and doesn't work at all.
According to testing by GerbilSoft, this commit's implementation
is able to read and extract files in the filesystem correctly,
but the tested game still isn't able to boot. (It's thanks to their
info about unencrypted disc images that I was able to make this commit.)
Initialising Wii filesystem contents should be done after Boot and
not in HW to ensure that we operate with the correct title context
and to make sure required title directories exist (so that Movie and
Netplay code can copy data from and to the temporary NAND).
Previously, given cases such as 0x80000000 / 0xFFFFFFFF we'd incorrectly
set the destination register value to zero. If the dividend is negative,
then the destination should be set to -1 (0xFFFFFFFF), however if the
dividend is positive, then the destination should be set to 0.
Note that the 750CL documents state that:
"If an attempt is made to perform either of the divisions --
0x80000000 / -1 or <anything> / 0, then the contents of rD are
undefined, as are the contents of the LT, GT, and EQ bits of the CR0
field (if Rc = 1). In this case, if OE = 1 then OV is set."
So this is a particular behavior of the hardware itself.
D3D11 cannot handle block compressed textures where the first mip level
is not a multiple of the block size. The simple fix for texture pack
authors: leave these textures uncompressed. You can still use a .dds
container.
Executing a supervisor-level instruction in user mode is supposed to
cause a program exception to occur.
The following supervisor instructions are present:
- dcbi
- mfmsr
- mfspr
- mfsr
- mfsrin
- mtmsr
- mtspr
- mtsr
- mtsrin
- rfi
- tlbie
- tlbsync
In 0337ca116a checks within mfspr and
mtspr were added. This change adds the trivial checks to the other
instructions.
Using 8-bit integer math here lead to precision loss for depth copies,
which broke various effects in games, e.g. lens flare in MK:DD.
It's unlikely the console implements this as a floating-point multiply
(fixed-point perhaps), but since we have the float round trip in our
EFB2RAM shaders anyway, it's not going to make things any worse. If we
do rewrite our shaders to use integer math completely, then it might be
worth switching this conversion back to integers.
However, the range of the values (format) should be known, or we should
expand all values out to 24-bits first.
Keeps signed values out of bit arithmetic (not that there's any issues
that could arise from it in these situations, but it does look more
consistent, and silences compiler warnings)
Also ensure that all members of the class are initialized on
construction as well. Previously the bool indicating if options are
dirty wouldn't be initialized, which could be read uninitialized if an
instance was constructed and then IsDirty() is called.
Keeps all of the interpreter-specific exception handling functions
together in a reusable way across translation units, similar to
FPUtils.h for reusable floating-point functions.
Given this is a base class, we should clearly state what the parameters
to the functions in its exposed interface actually mean or represent.
This avoids needing to hunt for the definition of the functions in cpp
files.
While we're at it, normalize said parameter names so they follow our
naming guidelines.
There's no reason to use int here as opposed to an unsigned value.
Video_AccessEFB() takes its arguments as u32 values, so we'd be doing
sign conversions for no reason here (along with causing avoidable
compiler warnings).
If a program executing in user mode tries to write to any SPRs other than
XER, LR, or CTR registers, then a program exception occurs. Similarly
this also applies for reading SPRs as well, however the upper and lower
timebase halves can also be read (but not written to).
If HID0.NOOPTI is set, then dcbt and dcbtst are no-oped globally. We
currently don't perform data cache emulation, but we put this in anyway
so this detail isn't forgotten about if data cache emulation is
introduced at some point in the future.
This implements ES_VerifySign which is notably used by the system menu
when importing saves.
Now *all* ES commands that are actually used by titles are implemented.
- Move all of the ec functions into the Common::ec namespace.
- Give the public functions better names and some usage information.
- Move all of the "elt" related functions into an "elt" class including
all of the arithmetic operations, so that the logic becomes clearer
and feels less like assembly.
This also makes it much more obvious what the parameters are, instead
of only using unsigned char* (which doesn't tell anything about what
the pointer is used for or the size).
- Similarly, add a new "Point" class and move point functions there.
Overload the arithmetic operators to make calculations easier to read
The loops relied on unsigned integer overflow, which is not immediately
obvious. Replace them with less clever variants that are clearer.
Also implement bn_compare using std::memcmp.
This function in both JITs is only ever called by passing the JIT's code
buffer into it. Given this is already accessible, since the functions
are part of the respective JIT class, we can just remove this parameter.
This also cleans up accesses with the new code buffer, as we don't need
to do janky looking dereference-then-index expressions.
This class effectively acted as a "discount vector", that would simply
allocate memory and then delete it in the destructor when it goes out of
scope.
We can just use a std::vector directly to reduce this boilerplate.
ImportTitleDone only checks if all required contents have been imported
for system titles.
This fixes the system menu not being able to recreate title directories
to copy a save back to the NAND by using title import functionality.
Given this is a bitmask, we should be using an unsigned type to store it
(especially given it's outside the range an int can represent properly
without being considered negative).
No behavior change is caused by this, it just silences a sign conversion
warning.
Because it wasn't parented properly, it would show briefly the widget in its own window when creating an ARCodeWidget or a GeckoCodeWidget which would occur when accessing the game properties page or when the state changes to pause/running.
PowerPC.h at this point is pretty much a general glob of stuff, and it's
unfortunate, since it means pulling in a lot of unrelated header
dependencies and a bunch of other things that don't need to be seen by
things that just want to read memory.
Breaking this out into its own header keeps all the MMU-related stuff
together and also limits the amount of header dependencies being
included (the primary motivation for this being the former reason).
This fixes 2 crashes with the pause function. One is when spamming the pause hotkey and the other is to press pause and step hotkeys at the same time. It does disable the screensaver getting disabled when the emulator is running, but paused, though, a better solution would have to be done without introducing these crashes.
Github didn't detect conflicts here, however, since the float handling
functions were moved into the Common namespace, this would cause a build
failure.
Ideally none of these macros would exist (long-term goal), however in
the meantime at least make sure expressions always evaluate correctly
(thankfully no current usages rely on this).
Given we're operating with flags and bit representations, lets avoid
signed values here. It lessens the amount of sign conversion warnings
and lessens the amount of things to think about screwing you over when
making changes to the interpreter among other things.
Hopefully this better matches the user's view of a texture - as large changes in
colour should be weighted higher than lots of very small changes
Note: This likely invalidates the current heuristic threshold default
These can be expressed in a slightly cleaner manner without so many
casts. While we're at it, also get rid of unnecessary indexing (we
already have the result nearby).
Extracts the self-contained code into its own function to clean up the
flow of Jit() a little more.
This also introduces a helper function to HLE.h that will be used to
reduce the boilerplate here and in the interpreter and Jit64 in the
following commits.
This function performs all of the preliminary checks required prior to
attempting to hook/replace a function at a given address. The function then
calls a provided object that satisfies the FunctionObject concept in the
C++ standard library. This can be a lambda, a regular function pointer,
an object with an overloaded function call operator, etc. The only
requirement is that the function return a bool, indicating whether or
not the function was replaced, and that it can take parameters in the
form: fn(u32 function, HLE::HookType type)
Gets rid of a second pair of ifdefs in the constructor. This also makes
sure the fd on Unix/BSD platforms is uniformly initialized. Previously
fd would be in an inconsistent state on FreeBSD or OpenBSD due to the
BSD OS checks not being present in the #elif within the constructor.
Previously, the entirety of CEXIETHERNET was exposed publically, which
wasn't necessary. We simply make the thread function part of the
internal interface, which gives it access to internal data members,
while keeping everything else outside of it.
Given these HLE classes inherit from a common base with a virtual
destructor, override is more appropriate here, as virtual propagates to
these destructors anyway.
This is also safer. If the base class' destructor is ever made
non-virtual, then these classes will cause a compilation error if they
aren't taken into account, as they'd be overriding a non-virtual
function (the destructor).
This is to avoid several issue with using 2 actions and switching between them. This commit will instead have one action get his property changed on pause and play.
Paired single (ps) instructions can call asm_routines that try to update
PowerPC::ppcState.pc. At the time the asm_routine is built, emulation has
not started and the PC is invalid (0). If the ps instruction causes an
exception (e.g. DSI), SRR0 gets clobbered with the invalid PC.
This change makes the relevant ps instructions store PC before calling out
to asm_routines, and prevents the asm_routine from trying to store PC
itself.
Moves the codebuffer access variables closer to their first use, and
gets rid of multiple indexing expressions. We already know which op
we're accessing in particular, so just make a reference to it and access
it instead of duplicating the expression all over the place.
A call like ReplaceAddress(address, 0) is pretty ambiguous; so is
ReplaceAddress(address, false), so use an enum class that tells people
straight-up what the replacer is.
This also gets rid of the really weird naming, where if 'blr' is true,
we'd be replacing the address with a NOP, rather than an actual BLR
instruction, so we invert that so it actually makes sense. There's no
actual bug fixed here though, considering the OnInsert functions
specified the correct values; it's literally just weird naming.
Without this macro, if any signals or slots were attempted to be used,
they wouldn't work; neither would various other features of the Qt
meta-object system. This can also lead to weird behavior in other
circumstances. Qt's documentation specifically states:
"Therefore, we strongly recommend that all subclasses of QObject use the
Q_OBJECT macro regardless of whether or not they actually use signals,
slots, and properties."
on its page for "The Meta-Object System", which can be seen here:
https://doc.qt.io/qt-5/metaobjects.html
Let's opt for "always do the right thing", and keep the code extensible
for the future and not have random things blow up on us.