This is a little trick I came up with that lets us restructure our float
classification code so we can exit earlier when the float is normal,
which is the case more often than not.
First we shift left by 1 to get rid of the sign bit, and then we count
the number of leading sign bits. If the result is less than 10 (for
doubles) or 7 (for floats), the float is normal. This is because, if the
float isn't normal, the exponent is either all zeroes or all ones.
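As a rough illustration of the check described above, here is a portable sketch for doubles. The helper names CountLeadingSignBits and IsNormalDouble are hypothetical, and a plain loop stands in for a dedicated count-leading-sign-bits instruction; this is not the JIT code itself.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Counts how many bits directly below the top bit are equal to the top bit.
// Hypothetical helper; a plain loop stands in for a dedicated instruction.
int CountLeadingSignBits(std::uint64_t x)
{
  const std::uint64_t top = x >> 63;
  int count = 0;
  for (int bit = 62; bit >= 0 && ((x >> bit) & 1) == top; --bit)
    ++count;
  return count;
}

// Hypothetical helper: true if d is a normal double
// (not zero, denormal, infinity or NaN).
bool IsNormalDouble(double d)
{
  std::uint64_t bits;
  std::memcpy(&bits, &d, sizeof(bits));
  // Shifting left by 1 drops the sign bit, so the 11 exponent bits end up on top.
  // An all-zeroes or all-ones exponent yields at least 10 leading sign bits.
  return CountLeadingSignBits(bits << 1) < 10;
}
```

For single-precision floats the same idea would use a 32-bit value and a threshold of 7, since the exponent is 8 bits wide.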
With this, situations where multiple arguments need to be moved
from multiple registers become easy to handle, and we also get
compile-time checking that the number of arguments is correct.
This is needed so that the checks added in the previous commit will be
reevaluated if the value of m_enable_dcache changes.
JitArm64 was already recompiling its asm routines on cache clear by
necessity. It doesn't have the same setup as Jit64 where the asm
routines are in a separate region, so clearing the JitArm64 cache
results in the asm routines being cleared too.
This value is used in a multiplication. The result of this
multiplication is then subtracted from m_base. By negating m_dec, we are
free to use an addition instead.
On x64, this saves an instruction.
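The identity behind this change can be sketched as follows. Only m_base and m_dec come from the description above; the struct names and the Get function are hypothetical and do not mirror the actual code.

```cpp
#include <cassert>

// Hypothetical stand-in for the old layout: m_dec is stored as-is.
struct Before
{
  int m_base;
  int m_dec;
  // Multiply, then subtract.
  int Get(int i) const { return m_base - i * m_dec; }
};

// Hypothetical stand-in for the new layout: m_dec is stored already negated.
struct After
{
  int m_base;
  int m_dec;
  // Multiply, then add.
  int Get(int i) const { return m_base + i * m_dec; }
};
```

Both versions compute the same value as long as After::m_dec holds the negation of Before::m_dec.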
A deep-copy method CopyReader has been added to BlobReader (virtual) and all of its subclasses (override). This should create a second BlobReader to open the same set of data but with an independent read pointer so that it doesn't interfere with any reads done on the original Reader.
As part of this, IOFile gains code to create a deep-copy IOFile pointer onto the same file, using platform-specific code to find the file ID from the file pointer and open a new one. A small piece has also been added to FileInfo to enable a deep copy, but its only subclass at this time already had a copy constructor, so this was relatively minor.
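A minimal sketch of the virtual copy pattern described above. BlobReader and CopyReader are the names from this message; MemoryBlobReader, ReadByte, and the shared in-memory buffer are hypothetical stand-ins for the real file-backed subclasses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

class BlobReader
{
public:
  virtual ~BlobReader() = default;
  // Returns a second reader over the same data with its own read position.
  virtual std::unique_ptr<BlobReader> CopyReader() const = 0;
  // Hypothetical sequential read, used only to show the independent position.
  virtual int ReadByte() = 0;
};

// Hypothetical in-memory subclass; the real subclasses read from files.
class MemoryBlobReader final : public BlobReader
{
public:
  explicit MemoryBlobReader(std::shared_ptr<const std::vector<std::uint8_t>> data)
      : m_data(std::move(data))
  {
  }

  std::unique_ptr<BlobReader> CopyReader() const override
  {
    // Same underlying data, fresh read position.
    return std::make_unique<MemoryBlobReader>(m_data);
  }

  // Returns the next byte, or -1 at the end of the data.
  int ReadByte() override { return m_pos < m_data->size() ? (*m_data)[m_pos++] : -1; }

private:
  std::shared_ptr<const std::vector<std::uint8_t>> m_data;
  std::size_t m_pos = 0;
};
```

Reads on the copy do not advance the original's position, which is the property the change is after.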
Resolve a warning caused by using values from two different enums in a
conditional expression, which was deprecated in C++20.
The warning in question is clang -Wdeprecated-anon-enum-enum-conversion
and gcc -Wenum-compare.
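For illustration, the pattern and one possible fix look roughly like this. The enum names and values are hypothetical, and the cast is just one way to give both arms the same type; the actual change may differ.

```cpp
#include <cassert>

// Two unrelated unscoped enums (hypothetical names and values).
enum SmallFlags { SMALL_FLAG = 1 };
enum LargeFlags { LARGE_FLAG = 2 };

// Deprecated in C++20: the two arms have different enumeration types.
//   int flag = large ? LARGE_FLAG : SMALL_FLAG;

// Converting each arm to a common integer type avoids the mixed-enum conversion.
int PickFlag(bool large)
{
  return large ? static_cast<int>(LARGE_FLAG) : static_cast<int>(SMALL_FLAG);
}
```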
We had one implementation of this type of data structure in Arm64Emitter
and one in VideoCommon. This moves the Arm64Emitter implementation to
its own file and adds begin and end functions to it, so that VideoCommon
can use it.
You may notice that the license header for the new file is CC0. I wrote
the Arm64Emitter implementation of SmallVector, so this should be no
problem.
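To show why begin and end matter for sharing the class, here is a minimal sketch. It assumes a fixed-capacity design backed by std::array; the members shown are hypothetical and not copied from the real file.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Minimal fixed-capacity vector sketch (hypothetical; not the actual class).
template <typename T, std::size_t MaxSize>
class SmallVector
{
public:
  void push_back(const T& value)
  {
    assert(m_size < MaxSize);
    m_array[m_size++] = value;
  }

  std::size_t size() const { return m_size; }

  // begin/end make the container usable in range-based for loops
  // and with standard algorithms.
  T* begin() { return m_array.data(); }
  T* end() { return m_array.data() + m_size; }
  const T* begin() const { return m_array.data(); }
  const T* end() const { return m_array.data() + m_size; }

private:
  std::array<T, MaxSize> m_array{};
  std::size_t m_size = 0;
};
```

With these two functions in place, a caller can write `for (int x : v)` over a `SmallVector<int, 4>` and only visit the elements that were actually pushed.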
This fixes a problem that started happening in CoreTimingTest after the
previous commit. CPUThreadConfigCallback registers a Config callback
only once per run of the process, but CoreTimingTest calls
Config::Shutdown after each test, and Config::Shutdown was clearing all
callbacks, preventing the callback from running after that.
In theory, our config system supports calling Set from any thread. But
because we have config callbacks that call RunAsCPUThread, it's a lot
more restricted in practice. Calling Set from any thread other than the
host thread or the CPU thread is formally thread unsafe, and calling Set
on the host thread while the CPU thread is showing a panic alert causes
a deadlock. This is especially a problem because 04072f0 made the
"Ignore for this session" button in panic alerts call Set.
Because so many of our config callbacks want their code to run on the
CPU thread, I thought it would make sense to have a centralized way to
move execution to the CPU thread for config callbacks. To solve the
deadlock problem, this new way is non-blocking. This means that threads
other than the CPU thread might continue executing before the CPU thread
is informed of the new config, but I don't think there's any problem
with that.
Intends to fix https://bugs.dolphin-emu.org/issues/13108.
Android interprets char as unsigned char, so comparing with 0 triggers a
tautological-unsigned-char-zero-compare warning.
Casting c to an unsigned char and removing the comparison with 0
resolves the warning while needing one less comparison on all platforms.
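A sketch of the idea, assuming the check in question is an ASCII range test. The function name IsAscii and the exact bounds are hypothetical; only the cast-and-drop-the-zero-compare approach comes from the text above.

```cpp
#include <cassert>

// Before (hypothetical): where char is unsigned, `c >= 0` is always true.
//   return c >= 0 && c <= 0x7F;
bool IsAscii(char c)
{
  // After the cast, negative values of a signed char map to 0x80..0xFF,
  // so a single upper-bound check covers both signed and unsigned char.
  return static_cast<unsigned char>(c) <= 0x7F;
}
```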
Replace the bool parameter force5bytes in J, JMP, and J_CC with an enum
class Jump::Short/Near. Many callers set that parameter to the literal
'true', which was unclear if you didn't already know what it did.
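The readability gain can be sketched like this. The Jump enum matches the description above; JumpSize is a hypothetical stand-in for the emitter functions, and it only reports the size of an unconditional x86 jump (2 bytes for the rel8 form, 5 bytes for the rel32 form).

```cpp
#include <cassert>

enum class Jump
{
  Short,
  Near,
};

// Hypothetical emitter stub: returns how many bytes an unconditional jump takes.
int JumpSize(Jump jump = Jump::Short)
{
  return jump == Jump::Near ? 5 : 2;
}
```

A call site now reads `JumpSize(Jump::Near)` instead of passing a bare `true`, so the intent is visible without looking up the parameter.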
The base DebugInterface now depends on the Core's CPUThreadGuard, and
utilities in Common shouldn't be depending on Core facilities. So, we
can move this into the core library instead.
Adds features to improve navigation of the Skylanders portal menu, including:
- List of Skylanders and filters for searching
- Improved buttons for faster loading options
- Added default user folder for storing .sky files
Previously this was using the default deleter (which just calls delete
on the pointer), which is incorrect, since the ENetHost instance is
allocated through ENet's C API, so we need to use its functions to
deallocate the host instead.
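The general shape of the fix is a std::unique_ptr with a custom deleter. To keep the sketch self-contained, FakeHost, fake_host_create, and fake_host_destroy are hypothetical stand-ins for a C API; the real code would call ENet's own functions instead.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Stand-ins for a C API (hypothetical).
struct FakeHost
{
  int id;
};

static int g_live_hosts = 0;

FakeHost* fake_host_create()
{
  // Allocated with malloc, so calling delete on it would be wrong.
  FakeHost* host = static_cast<FakeHost*>(std::malloc(sizeof(FakeHost)));
  host->id = 42;
  ++g_live_hosts;
  return host;
}

void fake_host_destroy(FakeHost* host)
{
  --g_live_hosts;
  std::free(host);
}

// Deleter that routes destruction through the C API instead of delete.
struct FakeHostDeleter
{
  void operator()(FakeHost* host) const noexcept { fake_host_destroy(host); }
};

using FakeHostPtr = std::unique_ptr<FakeHost, FakeHostDeleter>;
```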
This isn't used anywhere and not really a generic utility, so we can get
rid of it.
This also lets us remove MathUtil.cpp, since this was the only thing
within that file.
Fixes us forgetting to add its include directories. This could result in linking to a dylib from MacPorts while using the system's header, and then failing to link because they use different function names.
Added AchievementSettings in Config with RA_INTEGRATION_ENABLED, RA_USERNAME, and RA_API_TOKEN. Includes code to load from and store to the Achievements.ini file in the config folder.
This fixes a crash when recording fifologs, as the mutex is acquired when BPWritten calls AfterFrameEvent::Trigger, but then acquired again when FifoRecorder::EndFrame calls m_end_of_frame_event.reset(). std::mutex does not allow calling lock() if the thread already owns the mutex, while std::recursive_mutex does allow this.
This is a regression from #11522 (which introduced the HookableEvent system).
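The re-entrancy problem can be reproduced in miniature. The Event class below is a hypothetical simplification, not the HookableEvent code; it only shows a callback that ends up taking the same lock again on the same thread.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical event that runs its callback while holding a lock.
class Event
{
public:
  template <typename F>
  void Trigger(F&& callback)
  {
    std::lock_guard<std::recursive_mutex> lock(m_mutex);
    callback();
  }

  // Also takes the lock; may be reached from inside a callback.
  void Reset()
  {
    std::lock_guard<std::recursive_mutex> lock(m_mutex);
    ++m_reset_count;
  }

  int ResetCount() const { return m_reset_count; }

private:
  // With std::mutex, the nested lock in Reset would be undefined behavior
  // (typically a deadlock) when reached on the thread that already owns it.
  std::recursive_mutex m_mutex;
  int m_reset_count = 0;
};
```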
This second stack leads to JNI problems on Android, because ART fetches
the address and size of the original stack using pthread functions
(see GetThreadStack in art/runtime/thread.cc), and (presumably) treats
stack addresses outside of the original stack as invalid. (What I don't
understand is why some JNI operations on the CPU thread work fine
despite this but others don't.)
Instead of creating a second stack, let's borrow the approach ART uses:
Use pthread functions to find out the stack's address and size, then
install guard pages at an appropriate location. This lets us get rid
of a workaround we had in the MsgAlert function.
Because we're no longer choosing the stack size ourselves, I've made some
tweaks to where to put the guard pages. Previously we had a stack of
2 MiB and a safe zone of 512 KiB. We now accept stacks as small as 512 KiB
(used on macOS) and use a safe zone of 256 KiB. I feel like this should
be fine, but haven't done much testing beyond "it seems to work".
By the way, on Windows it was already the case that we didn't create
a second stack... But there was a bug in the implementation!
The code for protecting the stack has to run on the CPU thread, since
it's the CPU thread's stack we want to protect, but it was actually
running on EmuThread. This commit fixes that, since now this bug
matters on other operating systems too.
This very much isn't a build configuration that we're going to ship,
but I want to be able to tell people that they can build it on their
own if they really want to see how terribly it performs :)
Just like before, you'll need to edit two lines in app/build.gradle to
define ENABLE_GENERIC=ON and actually enable armeabi-v7a if you want an
armeabi-v7a build. This commit just fixes some compilation errors that
crop up if you do so.
This fixes a problem I was having where using frame advance with the
debugger open would frequently cause panic alerts about invalid addresses
due to the CPU thread changing MSR.DR while the host thread was trying
to access memory.
To aid in tracking down all the places where we weren't properly locking
the CPU, I've created a new type (in Core.h) that you have to pass as a
reference or pointer to functions that require running as the CPU thread.
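A sketch of how such a guard type can make the locking requirement visible in function signatures. The name CPUThreadGuard is borrowed from another message in this series; the lock counter, IsCPULocked, and HostRead are hypothetical, and the real guard would actually pause or lock the CPU thread.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the CPU lock state.
static int g_cpu_lock_depth = 0;

bool IsCPULocked()
{
  return g_cpu_lock_depth > 0;
}

class CPUThreadGuard
{
public:
  // Real code would lock the CPU thread here.
  CPUThreadGuard() { ++g_cpu_lock_depth; }
  ~CPUThreadGuard() { --g_cpu_lock_depth; }

  CPUThreadGuard(const CPUThreadGuard&) = delete;
  CPUThreadGuard& operator=(const CPUThreadGuard&) = delete;
};

// Requiring a guard reference means callers cannot reach this function
// without first constructing a guard, i.e. without locking the CPU.
unsigned HostRead(const CPUThreadGuard&, const std::array<unsigned, 4>& memory,
                  std::size_t index)
{
  assert(IsCPULocked());
  return memory[index];
}
```

A call site that forgets to lock simply fails to compile, because it has no guard to pass.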
While the NV extension is totally fine, the KHR extension should be able to support more hardware.
For NVIDIA, the hardware either supports both or neither; it just needs a driver from the last two years.
For AMD, drivers from late 2022-12 seem to bring support for the KHR extension.
For Intel, the KHR extension has also been supported for some years.
- Cancel doesn't shut down anymore, allowing it to be used multiple
times throughout the life of the WorkQueue
- Remove Clear, so we only have Cancel semantics
- Add IsCancelling so work items can abort early if cancelling
- Replace m_cancelled and m_thread.joinable() guards with m_shutdown.
- Rename Flush to WaitForCompletion (As it's ambiguous if a function
called flush should be blocking or not)
- Add documentation
A lot of the remaining complexity in Renderer is the massive Swap function
which tries to handle a bunch of FrameBegin/FrameEnd events.
Rather than create a new place for it, this event system will try
to distribute it all over the place.
Macros that expand to include the `defined` operator have undefined behavior.
This is pretty trivial to fix. We can just do the test and then define
the name itself if it's true, rather than making the set of definition
checks the macro itself.
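The before and after shapes look roughly like this. All macro names here (FEATURE_A, FEATURE_B, HAS_FEATURE) and the HasFeature function are hypothetical and chosen only for the example.

```cpp
#include <cassert>

// Hypothetical feature macro, defined here so the example has something to detect.
#define FEATURE_B 1

// Problematic: `defined` would be produced by macro expansion inside #if.
//   #define HAS_FEATURE (defined(FEATURE_A) || defined(FEATURE_B))
//   #if HAS_FEATURE

// Instead, do the test once and define the name itself when it holds.
#if defined(FEATURE_A) || defined(FEATURE_B)
#define HAS_FEATURE 1
#endif

bool HasFeature()
{
#ifdef HAS_FEATURE
  return true;
#else
  return false;
#endif
}
```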
Now that we've flipped the C++20 switch, let's start making use of
the nice new <bit> header.
I'm planning on handling this move away from BitUtils.h incrementally
in a series of PRs. There may be a few functions remaining in
BitUtils.h by the end that C++20 doesn't have any equivalents for.
The "vector shift by immediate" category encodes the shift amount for
right shifts as `size - amount`, whereas left shifts use `amount`.
We're not actually using SHRN/SHRN2 anywhere, which is why this has gone
undetected.
For quite some time now, we've had a setting on x86-64 that makes Dolphin
handle NaNs in a more accurate but slower way. There's only one game that
cares about this, Dragon Ball: Revenge of King Piccolo, and what that game
cares about more specifically is that the default NaN (or "generated NaN"
as I believe it's called in PowerPC documentation) is the same as on
PowerPC. On ARM, the default NaN is the same as on PowerPC, so for the
longest time we didn't need to do anything special to get Dragon Ball:
Revenge of King Piccolo working. However, in 93e636a I changed how we
handle FMA instructions in a way that resulted in the sign of NaNs
becoming inverted for nmadd/nmsub instructions, breaking the game.
To fix this, let's implement the AccurateNaNs setting, like on x86-64.
1. In some cases, ps_merge01 can be implemented using one instruction.
2. When we need two instructions for ps_merge01, it's best to start with
a MOV to avoid false dependencies on the destination register.
3. ps_merge10 can be implemented using a single EXT instruction.
This new function is like MOVP2R, except it masks out the lower 12 bits,
returning them instead of writing them to the register. These lower
12 bits can then be used as an offset for LDR/STR. This lets us turn
ADRP+ADD+LDR sequences with a zero offset into ADRP+LDR sequences with
a non-zero offset, saving one instruction.
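The address split itself is simple arithmetic, sketched below. PageAndOffset and SplitAddress are hypothetical names; the sketch only shows the masking and does not emit any instructions.

```cpp
#include <cassert>
#include <cstdint>

struct PageAndOffset
{
  std::uint64_t page;    // Address with the lower 12 bits cleared.
  std::uint32_t offset;  // The lower 12 bits, to be used as a load/store offset.
};

// Splits an address into the part that goes into the register
// and the part that is returned for use as an offset.
PageAndOffset SplitAddress(std::uint64_t address)
{
  return {address & ~std::uint64_t{0xFFF}, static_cast<std::uint32_t>(address & 0xFFF)};
}
```

Adding the two parts back together always gives the original address, which is why the offset can be folded into the following load or store.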