My change in 867cd99 caused farcode to fill up much more than before.
Most games still ran fine, but certain games would have multiple
cache clears per minute, on top of being overall slow.
Maybe there's something prettier we can do about this than just
allocating more RAM, but we have the resource budget for making
Dolphin use more RAM, so I consider increasing the size of the
cache to be a good solution at least for the time being.
At least for Prince of Persia: The Forgotten Sands, 48 MB isn't
enough to stop the cache clears, but 64 MB is. (I only played the
game for a few minutes when testing, though.)
To do this, I had to decouple framebuffer fetch from shader blending. We need to be able to access framebuffer fetch input when using shader logic ops.
If InputConfig::LoadConfig() was called once with a non empty/customized config,
then called again after manually deleting the config (dolphin calls LoadConfig() every time it opens the mapping widget),
the second load would fail to clear the values on any non first EmulatedController and would instead keep the
previous config values despite it being deleted (while it would instead correctly default the first EmulatedController).
This is not a big bug though the code is better now.
Fixes bug: https://bugs.dolphin-emu.org/issues/12744
Before e1e3db13ba
the ControllerInterface m_devices_mutex was "wrongfully" locked for the whole Initialize() call, which included the first device population refresh,
this has the unwanted (accidental) consequence of often preventing the different pads (GC Pad, Wii Contollers, ...) input configs from loading
until that mutex was released (the input config defaults loading was blocked in EmulatedController::LoadDefaults()), which meant that the devices
population would often have the time to finish adding its first device, which would then be selected as default device (by design, the first device
added to the CI is the default default device, usually the "Keyboard and Mouse" device).
After the commit mentioned above removed the unnecessary m_devices_mutex calls, the default default device would fail to load (be found)
causing the default input mappings, which are specifically written for the default default device on every platform, to not be bound to any
physical device input, breaking input on new dolphin installations (until a user tried to customize the default device manually).
Default devices are now always added synchronously to avoid the problem, and so they should in the future (I added comments and warnings to help with that)
This fixes the bad rendering on the first frame when using the software renderer: the software renderer's Z buffer started out at 0, but most games clear it to 0xffffff instead; this means that things don't render correctly except for in the regions where the screen was cleared by an EFB copy earlier in the frame.
The system menu does clear the RTC flags, but we currently aren't updating the cache file, and since we clear them the system menu doesn't know to update the cache either. This means that launching a game via the system menu, and then launching a game directly and exiting via HOME will result in the system menu using an outdated cache and displaying the old game. This causes it to fail to launch the game on the disc channel (since it doesn't match the cache), resulting in it resetting (though it will ignore the cache after resetting). Not clearing the cache avoids this issue.
PNG_FORMAT_RGB and PNG_COLOR_TYPE_RGB both evaluate to 2, but PNG_FORMAT_RGBA evaluates to 3 while PNG_COLOR_TYPE_RGBA evaluates to 6; the bit indicating a palette is 1 while the bit indicating alpha is 4.
Fixes Bomberman Jetters in single core mode.
When single core mode pauses the CPU to execute the GPU
FIFO it greedily executes the whole thing. Before this commit,
Finish and Token interrupts would happen instantly, not even
taking into account how long the current FIFO window has
taken to execute. The interrupts would be effectively backdated
to the start of this execution window.
This commit does two things: It pipes the current FIFO window
execution time though to the interrupt scheduling and it enforces
a minimum delay of 500 cycles before an interrupt will be fired.
Being able to preserve the address register is useful for the
next commit, and W0 is the address register used for loads. Saving
the address register used for stores, W1, was already supported.
If a host register has been newly allocated for the destination
guest register, and the load triggers an exception, we must make
sure to not write the old value in the host register into ppcState.
This commit achieves this by not marking the register as dirty
until after the load is done.
This does this following things:
- Default to the runtime automatic number of threads for pre-compiling shaders
- Adds a distinct automatic thread count computation for pre-compilation (which has less other things going on
and should scale better beyond 4 cores)
- Removes the unused logical_core_count field from the CPU detection
- Changes the semantics of num_cores from maximaum addressable number of cores to actually available CPU cores
(which is also how it was actually used)
- Updates the computation of the HTT flag now that AMD no longer lies about it for its Zen processors
- Background shader compilation is *not* enabled by default
Removed useless locks to DeviceContainer::m_devices_mutex, as they were all already protected by m_devices_population_mutex.
We have no interest in blocking other threads that were potentially reading devices at the same time so this seems fine.
This simplifies the code, and I've adjusted a few comments which mentioned possible deadlock that should now be totally gone.
The deadlock could have happen if a thread directly called EmulatedController::UpdateReferences(), while another another thread also reached EmulatedController::UpdateReferences() within a call to ControllerInterface::UpdateDevices(), as the mentioned function locked both the DeviceContainer::m_devices_mutex and s_get_state_mutex at the same time.
The deadlock was frequent on game emulation startup on Android, due to the UpdateReferences() call in InputConfig::LoadConfig() and the UI thread triggering calls to ControllerInterface::UpdateDevices().
It could also have happened on Desktop if a user pressed "Refresh Devices" manually in the UI while the input config was loading.
Also brought some UpdateReferences() comments and thread safety fixes from https://github.com/dolphin-emu/dolphin/pull/9489
This commit changes the default value of Fast Texture Sampling to true, and also moves the setting that controls it to the experimental section of the advanced tab. This is its own commit so that it can be easily reverted when we want to default to Manual Texture Sampling.
Co-authored-by: JosJuice <josjuice@gmail.com>
Specifically, when using Manual Texture Sampling, if textures sizes don't match the size the game specifies, things previously broke. That can happen with custom textures, and also with scaled EFB copies at non-native IRs. It breaks most obviously by not scaling the texture coordinates (so only part of the texture shows up), but the hardware wrapping functionality also assumes texture sizes are a power of 2 (or else it will behave weirdly in a way that matches how hardware behaves weirdly). The fix is to provide alternative texture wrapping logic when custom texture sizes are possible.
Note that both GLSL and HLSL provide a fwidth (fragment width) function defined as `fwidth(p) = abs(dFdx(p)) + abs(dFdy(p))`. However, it's easy enough to implement this ourselves (and it makes the code a bit more obvious).
The benefit to exposing this over the raw BP state is that adjustments Dolphin makes, such as LOD biases from arbitrary mipmap detection, will work properly.
These messages apply to the User directory regardless of
whether it's global or local, so we shouldn't specify "global".
Also changing "directory" to "folder", just for consistency
with "GC folder" in the same sentence.
We implement this by first rounding to nearest integer using the current
rouding mode, then converting this value from floating point to an integral
value.
Prefer using eax to isolate the sign bit. This saves a byte when the
destination ends up as r8-15, because those require a REX prefix.
Before:
41 8B C5 mov eax,r13d
41 C1 ED 1F shr r13d,1Fh
44 03 E8 add r13d,eax
41 D1 FD sar r13d,1
After:
41 8B C5 mov eax,r13d
C1 E8 1F shr eax,1Fh
44 03 E8 add r13d,eax
41 D1 FD sar r13d,1
This function was deprecated in ffmpeg in January[1], while its
replacement got introduced in 2015[2], so now might be the time to start
using it in Dolphin. :)
[1] f7db77bd87
[2] a9a6010637
This moves the only direct call to zlib’s crc32() into its own
translation unit, but that operation is cold enough that this won’t
matter in the slightest. crc32_z() would be more appropriate, but
Android has an older zlib version…
This was originally intended to fix https://bugs.dolphin-emu.org/issues/12717 but this ended up not being the issue (instead it seems like files just weren't recompiled when imgui was updated due to MSVC weirdness). Still, using brackets instead of quotes is preferable as this is a library include.
The reload stub is at a fixed address (0x80001800) so its hook flag
should be HookFlag::Fixed.
Otherwise the hook is installed by HLE::PatchFixedFunctions but
immediately removed by HLE::PatchFunctions (which is called by
HLE::Reload right after PatchFixedFunctions).
Should fix https://bugs.dolphin-emu.org/issues/12716
This also may eventually allow loading patches from sources other than the 1:1 expected file structure host file system, such as memory or an archive file.
While trying to work on adding audiodump support for CLI, I was alerted that it was important to first try moving the DSP configs to the new config before continuing, as that makes it substantially easier to write clean code to add such a feature.
This commit aims to allow for Dolphin to only rely on the new config for DSP-related settings.
Frame Advance Speed hotkeys were swapped. This likely occurred because speed and delay are inverses (i.e. a speed increase should DECREASE the delay and vice versa).
This replaces the MAX_LOGLEVEL define with a constexpr variable
in order to fix self-comparison warnings in the logging macros
when compiling with Clang. (Without this change, the log level check
in the logging macros is expanded into something like this:
`if (LINFO <= LINFO)`, which triggers a tautological compare warning.)
GCC complains about float_emit being null when inlining
ByteswapAfterLoad into MMIOLoadToReg. ByteswapAfterLoad
does dereference float_emit, but only when passing FLAG_FLOAT,
which MMIOLoadToReg has an assert for and does not support.
Also cleaning up some unnecessarily specified namespaces while
I'm at it.
The compiler was loudly announcing each and every branch Tev was not checking in
a switch statement, but Tev has learned it's lesson and will produce that
warning no more.
Reusing the slowmem handlers of existing blocks meshes badly
with reusing the empty space left when destroying blocks.
I don't think reusing slowmem handlers was much of a gain anyway.
This is done entirely through interpreter fallbacks. It would
probably be possible to implement this using host exception
handlers instead, but I think it would be a lot of complexity
for a rarely used feature, so let's not do it for now.
For performance reasons, there are two settings for this feature:
One setting which does enables just what True Crime: New York City
needs and one setting which enables it all. The latter makes
almost all float instructions fall back to the interpreter.
Instead of having a single GUI checkbox for "Always Hide Mouse Cursor",
I have instead opted to use radio buttons so the user can swap between
different states of mouse visibility. "Movement" is the default
behavior, "Never" will hide the mouse cursor the entire time the game is
running, and "Always" will keep the mouse cursor always visible.
Previously the unhide of movement mouse_timer reset occurred within case MouseButtonPress.
Additionally, there was a redundant expression in the if statement for cursor locking.
Now works with games that deliberately avoid invalidating TMEM because
they know textures are too large to fit:
* Sonic Riders
* Metal Arms: Glitch in the System
* Godzilla: Destroy All Monsters Melee
* NHL Slapshot
* Tak and the Power of Juju
* Night at the Museum: Battle of the Smithsonian
* 428: Fūsa Sareta Shibuya de
There are two reasons for this.
1. Using Dolphin's logging system lets the user decide whether
the printout should go to the terminal, the GUI, or a file.
fmt::print always prints to stdout... unless you're on Android, in
which case it does nothing at all, because Android disables stdout.
2. The Windows version of Dolphin crashes when you use fmt::print.
Yes, really. The crash happens because a call to std::fprint in
fmt::v7::detail::fwrite_fully returns that less characters were
written than requested, which fmt handles by throwing an exception.
(As always, Dolphin does not use exception handling.)
I'm not sure why std::fprint is doing this, but since switching
away from using fmt::print is a good idea due to the previous point
anyway, I'd say it's best to just switch.
Previously, if you have "Hotkeys Require Window Focus" disabled, you could repeatedly use the "Open" hotkey, for example, to stack File Open windows over top of each other over and over.
This commit allows the hotkey manager to disable/enable on QFileDialog creation and destruction.
Currently the logic for addressing the individual TexUnits is splattered all
across dolphin's codebase, this commit attempts to consolidate it all into a
single place and formalise it using our new TexUnitAddress struct.
MappingWindow is modal, yet the user can use hotkeys while the window is active. I believe hotkeys should not be recognized while this window is active.
This string is extremely likely to be mistranslated without the
proper context. Actually, it's probably impossible to translate
this string in a good way to some languages, but I'm not sure how
to solve that. Let's at least add an i18 comment for now.
PR #10066 added functionality to call std::abort when a panic alert occurs; however, that PR only implemented it for MsgAlert and not MsgAlertFmtImpl, meaning that the functionality was not used with PanicAlertFmt (only PanicAlert, which is not used frequently).
Yes, that's right! It's time to add even more NKit warnings,
because users still don't understand what NKit is or how it works!
More specifically, some users seem to be under the impression that
converting an NKit file to for instance RVZ using Dolphin's convert
feature will result in a normal RVZ file, when it in fact results in
an NKit RVZ file (since NKit is not a container format in the sense
that GCZ/WIA/RVZ/WBFS/CISO is, but rather a kind of trimmed ISO).
I can hardly blame users for not knowing this, because it's not
intuitive unless you know the technical details of how NKit works.
Previously, s_temp_input was being used for BOTH the savestate's and the movie's input printout in the panic alert.
This commit simply performs memcpy from the correct vector for the movInput printout.
Previously, the file dialog window was ambiguous between saving or loading a .dtm. This commit simply gives a bit more context to differentiate the two windows.
Previously, when playing back a movie, you could not see the total frame count of a movie, only the total number of input polls.
This change simply shows the total frame count on movie playback.
Note that this change also results in the framecount and framecount total ALWAYS being displayed if show_movie_window is true, regardless of whether or not m_ShowFrameCount is true. I believe this is fine, as TASers are much more likely to reference the framecount than the input poll count.
Previously, only the number of total input polls would be shown in the window title when playing back a movie. This simply adds the VI / frame count total as well, which is a much more relevant number to look at while TASing.
If this commit is not applied, then the previous commit will cause hotkeys to be saved if there is a syntax error when hitting "OK" and the user presses the X to close the window.
This commit only applies changes to hotkey config if no syntax error occurs.
Previously you could type whatever gibberish you wanted into the formula bar, press OK, and receive a modal syntax error window. Closing the syntax error window would cause the hotkey config window to close as well, and your gibberish would be applied to the hotkey assignment.
This commit requires that a syntax error does not occur before accept() is called.
Previously, using TAS Input to activate the digital L and R buttons would not show these inputs in the Input Display. This commit adds the digital L and R presses to the Input Display, and also displays just "L" or "R" if the analog is set to 255.
There are certain hotkeys that we absolutely want to be able to use
without being in-game. Presently, no hotkeys are recognized unless we
are in-game.
I've identified and moved the following hotkeys to be checked before the
HotkeyScheduler checks to see if the Core is running:
- Open
- Exit
- Start Recording
- Refresh Game List
Note that Play Recording should also be implemented here, however it
looks like there is no signal for a PlayRecording() function, so this
will have to be handled in a later PR once that signal is created and
implemented.
Now that we have enum helpers for inserting values into packets and have
migrated all other enumerations over, there's no need to keep this alias
around any longer.
Previously, it was not clear where the boundary of the StickWidget was when interacting outside of the circle. This aims to restore the gray square present in the Wx-era.
Over time OnData() has become a huge function-long case statement that
attempts to manage numerous packet-related behaviors, which makes it a
little difficult to reliably ensure certain handling doesn't interfere
with another case's. It's also mildly annoying to navigate due to its
size.
To make it a little easier to read and find the specific behavior, we
can break the relevant pieces of code out into their own functions.
When RenderDoc is attached, wglShareLists fails for some reason (see baldurk/renderdoc#2361). wglCreateContextAttribsARB has a parameter for the share context, so there's no reason to use a separate wglShareLists call.
Co-authored-by: baldurk <baldurk@baldurk.org>
Previous code from #7950 only clamps correctly when the efb copies
left and top coordinates are (0, 0)
Now we should handle all situations.
Spyro: A hero's tail is an example of a game that does an oversized
EFB copy with a non-zero origin.
If W0 is locked when fpr.RW is called, the indirectly called
ConvertSingleToDoubleLower may need to emit a push+pop, so it's
better for fresx/frsqrtex to call RW before locking W0 than after.
This way the address check will take up less icache (since it's
only emitted once for each routine rather than once for each
psq_st instruction), and we also get address checking for psq_l.
Matches Jit64's approach.
The disadvantage: In the slowmem case, the routines have to
push *every* caller-saved register onto the stack, even though
most callers probably don't need it. But at long as the slowmem
case isn't hit frequently, this is fine.
In the case of the JitAsm routines, we can't actually use
backpatching. Still, I would like to gather all the load and
store instructions in one place to make future changes easier.
This adjusts the NaN replacement logic introduced in #9928 to work around the HLSL compiler optimizing away calls to isnan, which caused that functionality to not work with ubershaders on D3D11 and D3D12 (it did work with specialized shaders, despite a warning being logged for both; that warning is also now gone). Note that the `D3DCOMPILE_IEEE_STRICTNESS` flag did not solve this issue, despite the warning suggesting that it might.
Suggested by @kayru and @jamiehayes.
This is a proper fix for the issue that 3071a1d was a workaround for.
It wasn't some kind of bug in the register cache that had laid dormant,
it was a simple mistake made in b24b79e.
Fixes a regression from ecf86bb.
The GPR allocation_order is initialized with only 28 elements,
so the 29th element ends up getting zero initialized.
Very sneaky bug...
Previously in Read_U64 and Write_U64 the value that was read or written
would be truncated to a 32-bit value before being passed off to the
memcheck handler, which can result in incorrect values being logged out.
Lets us simplify SDRUpdated() a little bit.
This also fixes the layout of UReg_SDR1. Turns out this struct has been
incorrect (from a little-endian perspective) the entire time and went
unnoticed, since the union was never used.
These are trivial to resolve.
Converting the structure member into a u32 results in no increase in
structure size, as it's making use of the three extra padding bits in
the structure.
On a real Wii, these constants are normally written by the system menu
(maybe even as part of common SDK code?)
However, they're cleared by IOS whenever a PPC title is launched.
IOS memsets 0x0-0x3fff and then manually writes some constants
in low MEM1. PR #4723 added most of the writes in the 0x31xx region
but left out the four writes to the legacy constant region.
Previously Dolphin didn't actually clear 0-0x3fff so those constants
would stick around after a system menu execution.
011f7789e0 exposed those missing writes.
Prompted by https://dolphin.ci/#/builders/24/builds/985
A 1-character typo in a recent PR caused FifoCI builds to break
horribly and spew millions of panic alerts until buildbot crashed.
This PR adds a new config option -- defaulting to off -- that allows
Dolphin to abort early on when a panic alert occurs instead of
continuing forever.
Optimize division by a constant into multiplication. This method is
also used by GCC and LLVM.
We also add optimized paths for divisors 0, 1, and -1, because they
don't work using this method. They don't occur very often, but are
necessary for correctness.
Makes the enum strongly typed instead of interacting with a raw u32
value. While we're at it, we can add helpers to the NWC24Config to make
using code poke at the internals of the class a little bit less and also
make the querying a little nicer to read.
Currently we were using heap allocating maps that last for the entire
duration of the emulator running.
Given the size N of both of these maps are very small (< 20 elements),
we can just make use of an array of pairs and perform linear scans. This
is also fine, given this code isn't particularly "hot" either, so this
won't be run often.
It seems like we spend a lot of the game list scanning time in
updateAdditionalMetadata, which I suppose makes sense considering
how many different files that function attempts to open.
With the addition of just one little atomic operation, we can make
it safe to call updateAdditionalMetadata without holding a lock.
These functions don't touch any class state, so they can be turned into
internal helper functions.
While we're at it, we can move the enumerations as well.
Although it's not clear what the xA and xB conditions are intended to do, the pattern indicates that xB is the regular version and xA is the inverted version, so for consistency, IsConditionB should be the main function.
Since the merge of b24b79e, we've gotten reports that the
following games are broken on JitArm64:
* Sonic Heroes
* The SpongeBob SquarePants Movie
* Astérix & Obélix XXL
* The Incredibles: Rise of the Underminer
Disabling the register cache avoids the issue, so the cause
of the bug might not actually have anything to do with the
newly implemented instructions. Nevertheless, I don't want
to ship a beta with this problem present, so I would like to
disable these instructions for the time being.
HandleFastmemFault works correctly when faults only happen in
expected locations, but it does some things that are rather
dangerous for faults in unexpected locations, like decrementing
an iterator without checking whether it's equal to begin.
This change cleans up the logic by making m_fault_to_handler's
key be the end of the fastmem region instead of the start.
Hardware testing indicated that SRS uses a different list of registers than LRS (specifically, acS.h can be used with SRSH but not LRS, and SRS does not support AX registers, and there are 2 encodings that do nothing).
* DSP*Arithmetic: Fix grammar for ANDCF and ANDF
* DSP*Arithmetic: Fix registers used by MOVAX and MOV
* DSP*Branch: Fix documentation for JMPR
* DSP*Branch: Fix HALT encoding ("I think I saw a two")
* DSP*ExtOps: Fix 'LN encoding (The listed encoding was for 'L)
* DSP*ExtOps: Improve documentation for 'LD and 'LDAX
* DSPJitExtOps: Correct typo
* DSP*LoadStore: Remove obsolete comment about pc in SRS (This was fixed in 1419e7e5b2)
* DSP*LoadStore: Fix comments for LRR/SRR
* DSP*Misc: Improve documentation for SBCLR and SBSET
* DSP*Multiplier: Fix MULXAC encoding (The previous encoding was for MULXMVZ)
* DSP*Multiplier: Fix tabs in MULCAC and MULCMVZ (There are some other tabs in comments in the JIT, but these are the only ones that are in instruction comments instead of indicating the corresponding interpreter code. Those other comments can be corrected in a different PR, as they're not documentation related.)
* DSPJitMultiplier: Fix MULXMVZ typo
These instructions were already implememented by Dolphin, but never added to the manual. Extension instructions will be handled in a later commit, as wlil instructions that were not previously implememented by Dolphin.
We were using a "value" register to avoid clobbering physical_addr,
but this isn't actually needed anymore. The only bits we need from
physical_addr after we start clobbering it are bits 5-9, and
those bits are identical in effective_addr and physical_addr,
so we can read them from effective_addr instead.
Fixes https://bugs.dolphin-emu.org/issues/12620
The changed code did not match the corresponding code in VertexShaderGen. Some parts of the sky have 2 color channels in each vertex, while others only have 1, despite only color channel 0 being used and XFMEM_SETNUMCHAN being set to 1 for both of them. The old code (from #4601) caused channel 0 to be set to channel 1 if the vertex contained both color channels but the number of channels was set to 1, which is wrong.
Makes our conversions between the different signs explicit to indicate
that they're intentional and also silences compiler warnings when
compiling with sign conversion or stricter truncation warnings enabled.
The extension needs to happen in SetLongAcc, not GetLongAcc, as the extension needs to always be reflected in acS.h.
There is no functional difference with the write handler for acS.h, but it is more readable than 4 casts in a row.
`IsLess` would incorrectly return true if both `SR_OVERFLOW` and `SR_SIGN` are set, as `(sr & SR_OVERFLOW) != (sr & SR_SIGN)` becomes `SR_OVERFLOW != SR_SIGN` which is true as the two masks are different. This broke in e651592ef5.
This issue only affected the DSP LLE Interpreter, and not the DSP LLE JIT.
I've also included a simple test case for this. `ax0.l` (on the top left) is set to 0 if the instruction following `IFL` does not execute and to 1 if it is executed.
Retail-signed discs use the format: IOS56-64-v5661.wad
Debug-signed discs use the format: firmware.64.56.22.29.wad
Debug-signed discs usually have a 128 version of the firmware as well,
since some devkits have 128 MB MEM2. (Retail has 64 MB.)
I found it a little bit annoying that you can't start typing
the desired address immediately after opening the window.
Also getting rid of the window's ? button while I'm at it.
When you come across a cheat code in a place like the Dolphin
wiki, it's often posted like this:
$16:9 Widescreen
0441187C 3FE38E39
Sometimes users try to paste this in its entirety into the Code
field, which leads to Dolphin reporting an error on the first line.
I think it would be nice to make this a little smoother by having
Dolphin accept having a first line that starts with $.