This caused us to update the indirect texture information in shaders more often than we needed to, which probably doesn't matter in practice since it's only used in ubershaders and copyyscale and stride are generally only updated before EFB/XFB copies, which generally will have other changes afterwards.
As far as I can tell, it has nothing to do with the mipmap/half_scale functionality, but does change based on the width of the destination texture (and the destination texture is half the width if half_scale is set). The comment that was there (which dates back to the initial megacommit) seems to not have accounted for the width aspect; it was first used as an actual stride in bbbe898839 (the first commit that used it at all).
The move assignment operator for a class is implicitly deleted when the
class has a non-static reference data member, which is true of
WiiSocket's m_socket_manager member.
Explicitly declaring the operator as default generates a
-Wdefaulted-function-deleted warning on Clang.
Delete the move constructor as well for consistency.
Fix -WSwitch warning about unhandled enum value SDL_NUM_LOG_PRIORITIES.
log_level is initialized to LNOTICE right before the switch statement so
this doesn't cause any behavior changes.
Use RunOnCPUThread instead of RunAsCPUThread in BeginRecordingInput.
Most OpenGL functions require an OpenGL context to have been created on
that thread before calling the function; when that isn't the case they
return invalid results which can cause crashes when passed into other
functions.
Dolphin creates the OpenGL context in the EmuThread which then becomes
either the CPU-GPU thread or the Video thread for single and dual core
respectively. OpenGL functions must therefore be called from that
thread.
Movie::BeginRecordingInput is called from the Host thread and runs a
block of code which ultimately creates a savestate, which in turn embeds
the framebuffer which requires calling various OpenGL functions.
In single core the use of RunAsCPUThread leads to this all happening on
the Host thread, eventually leading to invalid OpenGL calls and a crash.
In Dual core the crash is avoided because VideoBackendBase::DoState uses
the AsyncRequests::DO_SAVE_STATE event which causes VideoCommon_DoState
and its subsequent OpenGL calls to safely run on the Video thread.
This commit uses RunOnCPUThread instead of RunAsCPUThread, which causes
the subsequent code to run on the CPU-GPU thread in single core which
has the valid OpenGL context and so doesn't crash.
This makes it so that if you just want to reload the current style (eg. on program start, or in response to a system event), you don't need to know the name of the currently selected user style. It's also more consistent with the way the 'userstyle/enabled' flag works.
Before dbf5dca, the dirty flag had no meaning for an immediate value,
so we made sure to always set the dirty flag when switching a register
from Immediate to Register. But after dbf5dca, that is no longer the
case. If an immediate is marked as not dirty, we can keep the register
marked as not dirty after materializing the value. This way we skip
having to write it back to ppcState later.
Without this change, non-dirty immediates don't actually get flushed.
This can be a problem if we for instance are flushing all registers in
order to execute an interpreter fallback. If that interpreter fallback
writes to a register that contained a non-dirty immediate, the JIT will
keep using the old value instead of loading the updated value.
This required a change in the denormal path where, instead of
subtracting 11 before shifting left, we shift left immediately and then
shift right by 11. This shouldn't affect performance.
Instead of combining X2 (the exponent) and X3 (the mantissa) using an
ORR instruction, we can read X1, which already contains both.
This requires us to reconstruct X1 in the denormal path, but that's
an acceptable price.
Dolphin's JITs have a minor terminology problem: The term "fastmem" can
refer to either the system of switching between a fast path and a slow
path using backpatching, or to the fast path itself. To hopefully make
things clearer, I'm adding some new terms, defining the old and new
terms as follows:
Fastmem: The system of switching from a fast path to a slow path by
backpatching when an invalid memory access occurs.
Fast access: A code path that accesses guest memory without calling C++
code.
Slow access: A code path that accesses guest memory by calling C++ code.
With this, situations where multiple arguments need to be moved
from multiple registers become easy to handle, and we also get
compile-time checking that the number of arguments is correct.
At the end of each frame automatically update the Current Value for
visible table rows in the selected and visible CheatSearchWidget (if
any). Also update all Current Values in all CheatSearchWidgets when the
State changes to Paused.
Only updating visible table rows serves to minimize the performance cost
of this feature. If the user scrolls to an un-updated cell it will
promptly be updated by either the next VIEndFieldEvent or the State
transitioning to Paused.
If dcache is enabled when the game starts, initializing the fastmem
arena is still useful in case the user changes the dcache setting.
And initializing it doesn't really cost anything.
Preparation for the next commit.
JitArm64 has been conflating these two flags. Most of the stuff that's
been guarded by fastmem_arena checks in fact requires fastmem.
When we have fastmem_arena without fastmem, it would be possible to do
things a bit more efficiently than what this commit does, but it's
non-trivial and therefore I will leave it out of this PR. With this
commit, we effectively have the same behavior as before this PR - plus
the added ability to toggle fastmem with a cache clear.
This is needed so that the checks added in the previous commit will be
reevaluated if the value of m_enable_dcache changes.
JitArm64 was already recompiling its asm routines on cache clear by
necessity. It doesn't have the same setup as Jit64 where the asm
routines are in a separate region, so clearing the JitArm64 cache
results in the asm routines being cleared too.
Some code paths in EmuCodeBlock.cpp that were checking fastmem_arena
should really also be checking m_enable_dcache.
Because JitArm64 centralizes more or less all memory access to the
EmitBackpatchRoutine function and because that function already
contained a check, JitArm64 works fine without the additional checks
added by this commit. Regardless, I added the checks to MMU.cpp instead
of EmuCodeBlock.cpp where applicable so they would be available to
JitArm64. Maybe one day JitArm64 will need them if its code gets
restructured.
The table only needs to be recreated when the displayed addresses might
change. If we're just refreshing the current values then update those
table cells and leave the rest of the table alone.
The code previously did this indirectly via `std::map<double, int>`, the key being the timestamp, which required a questionable workaround for the case where multiple states have the same timestamp. By having a particular combination of timestamps in the on-disk savestates, you could cause this workaround to infinitely loop, locking up Dolphin. This avoids this completely by refactoring the logic and just using `std::vector` instead.
Since ccf92a3e56, recording fifologs multiple times after launching dolphin caused all initial state to not be saved (the initial contents of bpmem, cpmem, etc were all zeroed out). For some games, this was not noticeable, as most registers were set each frame, but for others, this resulted in completely broken fifologs. (Note that recording fifologs also required 05181f6b88 and 9e0755a598 to be cherry-picked due to other, since fixed, regressions.)
This was because previously, `Renderer::CheckFifoRecording` was called every frame, but ccf92a3e56 changed it into a callback (`m_end_of_frame_event`) that was removed when recording ended. Thus, before, `OpcodeDecoder::g_record_fifo_data = IsRecording()` was called when `IsRecording()` returned false, but after that commit `g_record_fifo_data` never got changed back to false, so the check for `was_recording` only ever passed on the first fifolog recorded (even after stopping and starting a game).
There may still be another issue lurking, as I'm not sure if all broken fifologs were caused by recording multiple fifologs (for instance, on https://bugs.dolphin-emu.org/issues/13377, only one fifolog was initially uploaded, but it was affected by an issue with the same symptoms as this).
Instructions referencing registers r8-r15 take an additional byte to
encode. `reg_downcount` may be assigned to one of these registers, so it
is a small size win to store the downcount value in `RSCRATCH` first.
Before:
33 D2 xor edx,edx
44 8B 6D 64 mov r13d,dword ptr [rbp+64h]
45 85 ED test r13d,r13d
7E 30 jle 0000023546B43F6D
44 8B B5 D4 02 00 00 mov r14d,dword ptr [rbp+2D4h]
41 8B C5 mov eax,r13d
BF 07 00 00 00 mov edi,7
F7 F7 div eax,edi
After:
33 D2 xor edx,edx
8B 45 64 mov eax,dword ptr [rbp+64h]
85 C0 test eax,eax
7E 30 jle 000001AFBBAE359D
44 8B B5 D4 02 00 00 mov r14d,dword ptr [rbp+2D4h]
44 8B E8 mov r13d,eax
BF 07 00 00 00 mov edi,7
F7 F7 div eax,edi
This value is used in a multiplication. The result of this
multiplication is then subtracted from m_base. By negating m_dec, we are
free to use an addition instead.
On x64, this saves an instruction.
Split the "welcome" messages letting players know achievements are active into a separate method that gets called (currently) after a number of frames to ensure that the emulator has started properly and has somewhere to display the messages.
A new tab is added to the Achievements dialog to chart out the leaderboards in a table. Each row of the table contains the leaderboard information and up to four relevant entries, varying based on how many entries are in the leaderboard, whether or not the player has a submitted score, and where in the leaderboard the player's score is.
FetchBoardInfo is called (via the work queue asynchronously) on a leaderboard every time it is activated or submitted to. It makes two calls to the RetroAchievements API for fetching leaderboard info, one that requests the top four entries in the leaderboard and another that requests the player's entry, the two entries above the player and the two entries below. All of these are inserted into a single map (resolving any overlaps) so the result can be exposed to the UI.
The leaderboard map created here contains information useful to displaying leaderboard stats in the Achievement dialog, including each leaderboard's name and description and a partial list of entries for display. The entire map is exposed to the UI in a single call for simplicity.
Both of these functions access `m_game_data` and don't lock themselves, so they must be called in a way that guarantees that `m_game_data` is not modified during the call.
Scope issue in the event callback from `rc_runtime_do_frame()`. The pointer points to a variable on the stack from inside `rc_runtime_do_frame()`, so that's a race condition between the thread calling `rc_runtime_do_frame()` and the event queue thread.
Added handling to Achievement Progress Events, which are generated when an achievement with a Measured field updates in value. For example, an achievement for collecting 120 stars will throw this event when the player collects each star, and with this handling, the player will get a message on screen informing them of this progress. This message will only appear if the newly added RA_PROGRESS_ENABLED setting is true.
A deep-copy method CopyReader has been added to BlobReader (virtual) and all of its subclasses (override). This should create a second BlobReader to open the same set of data but with an independent read pointer so that it doesn't interfere with any reads done on the original Reader.
As part of this, IOFile has added code to create a deep copy IOFile pointer onto the same file, with code based on the platform in question to find the file ID from the file pointer and open a new one. There has also been a small piece added to FileInfo to enable a deep copy, but its only subclass at this time already had a copy constructor so this was relatively minor.
The achievement badges will now have a blue or gold border to identify whether they have been unlocked in softcore or hardcore mode. Similarly, the game badge will have a blue border if all achievements have been unlocked in either mode or a gold border if all achievements have been unlocked in hardcore mode.
Provided the badges are turned on in the settings, each achievement will have a badge next to it on the progress tab. There are different badges for locked and unlocked (usually locked is grayscale while unlocked is in color but not necessarily) and the badge chosen depends on the player's current unlock and hardcore status.
Provided badges are turned on, if there's a player logged in their RetroAchievements icon will appear next to their player info in the header of the Achievements dialog. If they're playing a game, so will the icon for the game. Also performed some refactoring and reorganizing to the header as a whole so that it looks consistent whether a game is running or not.
FetchBadges fetches all available badges (player, game, achievements) and stores them in AchievementManager's memory. New fields and accessors have been added as necessary. Badges for individual achievements are stored in the UnlockStatus map. The method is public so that the settings dialog can call it if badges are turned on after a game is started. Badges are deleted at game close and logout.
When evaluating whether jo.fastmem should be set to true, we check the
value of jo.fastmem_arena. However, due to a change made in 28e8117b90,
jo.fastmem_arena wasn't set until after the first time we set
jo.fastmem, so jo.fastmem would end up always being false until the next
time RefreshConfig was called.
Fixes https://bugs.dolphin-emu.org/issues/13364.
Image requests from RetroAchievements have a slightly different format from other requests, on account of being GET calls that return strings of image data, so this is a separate function that makes such calls. To handle this, I create a BadgeStatus object that ties each badge to a boolean flag for whether it is loaded and thus usable.
And in order to avoid a double dereference in the dispatcher, directly store
the normalEntry in the map.
The index to the block map becomes ((((DR<<1) | IR) << 30) | (address >> 2)).
This has been chosen since the msr bits change less often than the address,
thus we keep nearby entries together.
Also do not call the C dispatcher in case the assembly dispatcher didn't
find a block, since it wouldn't find a block either due to the 1:1 mapping,
except when falling back to the non shm segment lookup table.
We had two nearly identical copies of the gather pipe exception checking
code right next to each other, with the only difference being the
register used. Let's unify them, like in Jit64.
This bug has been lurking in the code ever since I added the discard
functionality. It doesn't seem to be triggered all that often,
and on top of that the emitted code only runs conditionally, so I'm not
sure if people have been affected by this bug in practice or not.
We need to check for the address of the *previous* instruction, because
checking fifoWriteAddresses happens not at the end of the instruction
that triggered it but at the start of the next instruction.
This refactors the Rich Presence generation to store to a member field that can be exposed to the UI to display the Rich Presence in the achievement header. It still updates at its original rate of once per two minutes, but calls an update on the dialog when that ticks.
Moved AchievementManager Init further down in the MainWindow constructor; its original position was because it had an impact on the contents of the menu bar, and this is no longer the case.
Whenever JitBaseBlockCache::Clear() got called, it threw away the memory mapping for the fast block map and created a new one. This new mapping typically got mapped at the same address at the old one, but this is not guaranteed. The pointer to the mapping gets embedded in the generated dispatcher code in Jit64AsmRoutineManager::Generate(), which is only called once on game boot, so if the new mapping ended up at a different address than the old one, the pointer in the ASM pointed at garbage, leading to a crash.
This fixes the issue by guaranteeing that the new mapping is mapped at the same address.
While both fastmem and the BLR optimization depend on fault handling,
the BLR optimization doesn't depend on fastmem, and there are cases
where you might want the BLR optimization but not fastmem. For me
personally, it's useful when I try to use a debugger on Android and have
to disable fastmem so I don't get SIGSEGVs all the time, but it would be
especially useful for iOS users.
When we execute a JIT block, we have to make sure that both the PC and
the DR/IR bits of MSR are the same as they were when the block was
compiled. When jumping to a block from the dispatcher, this is done in
the way you would expect: By checking the PC and the relevant MSR bits.
However, when returning to a block using the BLR optimization, we only
check the PC. Checking the MSR bits is done by instead resetting the
stack when the MSR changes, making PC checks afterwards fail.
Except... We were only resetting the stack on rfi instructions. There
are actually many more ways for the MSR to change, and we weren't
covering those at all. I looked into resetting the stack on all of them,
but it would be pretty cumbersome both in terms of writing the code and
in terms of how often at runtime we'd have to reset the stack, so I
think the better option would be to check the MSR bits along with the
PC. That's what this commit implements.
Because CPU thread config changed callbacks are no longer instant,
g_Config.iEFBScale doesn't yet contain the new value when the hotkey OSD
code tries to read it.
Should fix https://bugs.dolphin-emu.org/issues/13343.
There's no reason not to allow this now that these settings are
cleanly integrated into the new config system. (Actually, maybe
we could even have done this before the previous commit...)
This fixes a problem where changing the JIT debug settings on
Android while a game was running wouldn't cause the changed settings
to apply to code blocks that already had been compiled.
Resolve warning caused by using values from two different enums in a
conditional expression which was deprecated in c++20.
The warning in question is clang -Wdeprecated-anon-enum-enum-conversion
and gcc -Wenum-compare.
We had one implementation of this type of data structure in Arm64Emitter
and one in VideoCommon. This moves the Arm64Emitter implementation to
its own file and adds begin and end functions to it, so that VideoCommon
can use it.
You may notice that the license header for the new file is CC0. I wrote
the Arm64Emitter implementation of SmallVector, so this should be no
problem.
Modify PPCSTATE_OFF and PPCSTATE_OFF_ARRAY macros when using GCC to
avoid useless log spam. Specifically, use a consteval lambda with gcc
_Pragma statements to disable the -Winvalid-offsetof warning inside the
macros.
Each successful build (and many failing ones) on the Android buildbot
generates almost 300 cases of -Winvalid-offsetof, resulting in thousands
of lines of log spam per build. In addition to bloating the log filesize
these spurious warnings make it harder to find actual warnings.
These warnings are generated by calls to the macros PPCSTATE_OFF and
PPCSTATE_OFF_ARRAY, which in turn are used by many other macros used by
the JIT. The ultimate cause is that offsetof is only conditionally
supported on non-standard-layout types, which includes the PowerPCState
struct.
To address potential questions of whether there's a better way to handle
this:
The obvious solution would be to modify PowerPCState so that it does
have a standard layout. This is unfortunately impractical.
To have a standard layout a type can only contain other types with
standard layouts. None of the stl containers are guaranteed to have
standard layouts, and PowerPCState contains a std::tuple and std::array.
PowerPCState also contains a PowerPC::Cache and InstructionCache which
themselves contain std:arrays and std::vectors.
Furthermore InstructionCache derives from Cache, and a derived class can
only have standard layout if at most one class in its hierarchy has a
non-static data member, but both classes have such members. Making
InstructionCache have a standard layout would require duplicating all
the functionality of Cache so it no longer derived from it, as well as
replacing the stl containers. This might require having a raw pointer to
said containers, with the manual memory management that implies.
All of that would be much more disruptive than would be justified to get
rid of some warnings (however annoying they might be). This is
compounded by the fact that PowerPCState hasn't had a standard layout
for a long time, if ever, and if the PPCSTATE_OFF macros weren't working
reliably it would have become obvious a long time ago.
As to why I picked the lambda solution over other potential changes:
- Keeping the define as-is and wrapping some gcc #pragmas around it
doesn't work because the pragmas don't get included when the define is
substituted to the call site.
- Keeping the define as a non-lambda expression and using inline
_Pragma() statements would ideally be better and works fine for msvc,
but fails for GCC with "'#pragma' is not allowed here".
- Turning off -Winvalid-offsetof globally for gcc would work, but there
might be other contexts where offsetof is problematic and GCC seems to
be the only compiler warning about it.
This does require another register, but we skip having to use shifted
ADD, which takes two cycles on some CPUs, and we gain instruction-level
parallelism.