These are only ever used with ShaderCode instances and nothing else.
Given that, we can convert these helper functions to expect that type of
object as an argument and remove the need for templates, improving
compiler throughput a marginal amount, as the template instantiation
process doesn't need to be performed.
We can also move the definitions of these functions into the cpp file,
which allows us to remove a few inclusions from the ShaderGenCommon
header. This uncovered a few instances of indirect inclusions being
relied upon in other source files.
One other benefit is this allows changes to be made to the definitions
of the functions without needing to recompile all translation units that
make use of these functions, making change testing a little quicker.
Moving the definitions into the cpp file also allows us to completely
hide DefineOutputMember() from external view, given it's only ever used
inside of GenerateVSOutputMembers().
A very trivial conversion, this simply converts calls to Write over to
WriteFmt and adjusts the formatting specifiers as necessary.
This also allows the const char* parameters to become std::string_view
instances, allowing for ease of use with other string types.
The include for X11Utils.h (and by extension Xlib.h) is gated behind
HAVE_XRANDR, as well as the declaration for this function, but its
definition was mistakenly gated behind HAVE_X11. Therefore, if we have
X11 but not Xrandr, the build will fail due to declaration/definition
mismatch and the missing Window type.
CopyNandFile must not create empty files on the destination filesystem
if the source file doesn't exist.
Otherwise, this can lead to an empty Mii database being created in the
session Wii root if there's no database in the configured Wii root and
netplay or Movie is used -- that database would then be copied back to
the configured root, which causes games like MKW to complain about
corrupted Mii data even when the player has stopped using netplay.
This commit also simplifies CreateFullPath usage.
There's no need to manually extract the directory from the path,
FS::CreateFullPath does it automatically just like File::CreateFullPath
Remove the warning:
warning: offsetof within non-standard-layout type ‘JitBlock’ is conditionally-supported
JitBlock contains non-trival types now. Split the fields with trival
types that needs to be access from JIT code into JitBlockData structure.
`std::abs(x - y)` where x and y are unsigned integers fails to compile
with an "call of overloaded 'abs(unsigned int)' is ambiguous" error
on GCC, and even if it did compile, that expression still wouldn't
give the correct result since `x - y` is unsigned.
Lambda expressions with uncaptured constants were leading to errors,
and there were also some warnings about deprecated functions
(QFontMetrics::width and inet_ntoa).
Panic alerts don't use fixed width fonts, and translators are
unlikely to preserve the exact spacing unless they are given
specific instructions to do so and are willing to fight against
the Transifex interface a bit.
We must not provide the /Externals directory as global include directory.
Here, this yield a crash because of external minizip header and system library mismatch.
Soundtouch itself recormends to include it with <SoundTouch.h> and -I/usr/include/soundtouch, so this should fit better.
It actually maps to postMtxInfo, not posMtxInfo (which isn't a thing).
This is especially confusing because there *are* position matrices (as
opposed to post-transform matrices).
Changed several enums from Memmap.h to be static vars and implemented Get functions to query them. This seems to have boosted speed a bit in some titles? The new variables and some previously statically initialized items are now initialized via Memory::Init() and the new AddressSpace::Init(). s_ram_size_real and the new s_exram_size_real in particular are initialized from new OnionConfig values "MAIN_MEM1_SIZE" and "MAIN_MEM2_SIZE", only if "MAIN_RAM_OVERRIDE_ENABLE" is true.
GUI features have been added to Config > Advanced to adjust the new OnionConfig values.
A check has been added to State::doState to ensure savestates with memory configurations different from the current settings aren't loaded. The STATE_VERSION is now 115.
FIFO Files have been updated from version 4 to version 5, now including the MEM1 and MEM2 sizes from the time of DFF creation. FIFO Logs not using the new features (OnionConfig MAIN_RAM_OVERRIDE_ENABLE is false) are still backwards compatible. FIFO Logs that do use the new features have a MIN_LOADER_VERSION of 5. Thanks to the order of function calls, FIFO logs are able to automatically configure the new OnionConfig settings to match what is needed. This is a bit hacky, though, so I also threw in a failsafe for if the conditions that allow this to work ever go away.
I took the liberty of adding a log message to explain why the core fails to initialize if the MIN_LOADER_VERSION is too great.
Some IOS code has had the function "RAMOverrideForIOSMemoryValues" appended to it to recalculate IOS Memory Values from retail IOSes/apploaders to fit the extended memory sizes. Worry not, if MAIN_RAM_OVERRIDE_ENABLE is false, this function does absolutely nothing.
A hotfix in DolphinQt/MenuBar.cpp has been implemented for RAM Override.
Utilizing constexpr, we can eliminate the need to construct the tables
at runtime and just do all the work at compile-time. Making for less
moving parts overall.
The general structure is more or less the same, however rather than one
single initialization function, each table is built off an immediately
executed lambda function. This is nice, since it narrows the scope of
the table building logic down to the tables that actually need it.
While having motion control emulation of IR enabled by default
makes sense in situations like using a DualShock 4 on a PC,
Android has the additional option of touch emulation of IR
which seems to be better liked, and the default value which
was chosen with PC in mind was carried over to Android
without any particular consideration. This change disables
motion control emulation of IR by default on Android only.
The code was actually already rather well adapted for this.
We more or less just have to skip ParseDisc and run
ParsePartitionData directly. This required the PartitionHeader
struct to be removed (which wasn't that useful anyway).
If we start 31 KiB into a 32 KiB block and want to mark 2 KiB
of data as used, we need to mark 2 blocks as used, not just 1.
This problem is avoided when calling MarkAsUsed from
MarkAsUsedE, since MarkAsUsedE aligns to 32 KiB on its own.
Most calls to MarkAsUsed are from MarkAsUsedE, which is why
this hasn't been a noticeable problem in the past.
The constant DESIRED_BUFFER_SIZE was determined by multiplying the
old hardcoded value 32 with the default GCZ block size 16 KiB.
Not sure if it actually is the best value, but it seems fine.
Similar to what we do for addx. Since we're calculating b - a and
because subtraction is not communitative, we can only apply this when
source register a holds the constant.
Before:
45 8B EE mov r13d,r14d
41 83 ED 08 sub r13d,8
After:
45 8D 6E F8 lea r13d,[r14-8]
We can get away with skipping the addition when we know we're dealing
with a constant zero. Just a MOV will suffice in this case.
Once again, we don't bother to add separate handling for when overflow
is needed, because no titles would ever hit that path during my testing.
Before:
8B 7D F8 mov edi,dword ptr [rbp-8]
83 C7 00 add edi,0
After:
8B 7D F8 mov edi,dword ptr [rbp-8]
ADD has a smaller encoding for immediates that can be expressed as an
8-bit signed integer (in other words, between -128 and 127). MOV lacks
this compact representation.
Since addition allows us to swap the source registers, we can always get
the shortest sequence here by carefully checking if we're dealing with a
small immediate first. If we are, move the other source into the
destination and add the small immediate onto that. For large immediates
the reverse is preferrable.
Before:
41 BE 40 00 00 00 mov r14d,40h
44 03 75 A8 add r14d,dword ptr [rbp-58h]
After:
44 8B 75 A8 mov r14d,dword ptr [rbp-58h]
41 83 C6 40 add r14d,40h
Before:
44 8B 7D F8 mov r15d,dword ptr [rbp-8]
41 81 C7 00 68 00 CC add r15d,0CC006800h
After:
41 BF 00 68 00 CC mov r15d,0CC006800h
44 03 7D F8 add r15d,dword ptr [rbp-8]
When the source registers are a simple register and a constant zero and
overflow isn't needed, emitting LEA is kinda silly.
This will occasionally save a single byte for certain registers due to
how x86 encoding works. More importantly, LEA takes up execution
resources while MOV does not.
Before:
41 8D 7D 00 lea edi,[r13]
After:
41 8B FD mov edi,r13d
When the destination register matches a source register, the other
source register contains zero, and overflow isn't needed, the
instruction becomes a nop and we don't need to emit anything.
We could add specialized handling for the case where overflow is needed,
but none of the titles I tried would hit this path.
Before:
83 C7 00 add edi,0
After:
No functional change, just simplify some repeated logic in the case
where we're dealing with exactly one immediate and one simple register
when overflow isn't needed.
On some platforms (like Windows), the temporary file must be closed
before it can be renamed.
I guess nobody noticed this for so long because (1) the FS code has a
failsafe for missing FST entries (because existing users do not have
a FST), and most games do not care about file metadata;
(2) the write failures can only be seen in the logs.
Because we don't want this to break, I have turned the ERROR_LOGs into
PanicAlerts.
The threads can't actually be started when determinism is enabled, as
the behavior would not be deterministic, but Open() still tries to
start the threads and wait, resulting in a deadlock when booting
certain games and homebrew in NetPlay.
https://bugs.dolphin-emu.org/issues/11997
The problem seemed to be that s_DILENGTH would get set to 0
at times when it shouldn't. Simply not changing it in case
of NoReply or DTK seems to fix the problem. However, we can
actually go one step further in accuracy and use data.size()
to change s_DIMAR and s_DILENGTH as partial reads (NoReply
commands) complete, instead of jumping directly to 0 when
the whole read completes.
It used to be the case that frame advance skipped duplicate frames
(i.e. it would take 30 frame advances to get through one second
of emulated time in a 30 fps game), but this broke in 9c5c3c0.
Skipping duplicate frames making TASing less annoying.
Since we are calling this off the UI thread, we can't use anything which
accesses the underlying NSView object. We create and set the Metal layer
on the UI thread before the video backend is initialized. This extension
is both compatible with MoltenVK and gfx-portability for accepting a
layer at surface creation.
This fixes a bug where pressing Enter in the "Do you want to stop the
current emulation?" confirmation popup also triggers a KeyRelease in
GameList, which starts a new game.
Also fixes the same crash when accessing the game details
(which only can be accessed after long pressing a game).
The problem was that we were not using a theme that had
an AppCompat theme as a parent.
Unfortunately, the game details dialog uses white on white on
Android TV, and I don't know how to fix this in a clean way.
Hardware tests have shown that if the number of texgens/channels do not
match, you get garbage rendering. Presumably because the output
registers from the XF stage are fed into the incorrect input registers
for TEV/BP.
Currently, this causes Dolphin to crash/generate invalid shaders with an
assertion failure in the hardware backends. Instead, we log an error.
Perhaps in the future we should just spit out all texgens/colors anyway
from both stages, and let cross-stage optimization take care of DCE'ing
it away. But doing so would require changing the UIDs and invalidating
everyone's shader caches.
This fixes a crash on ATV devices, because the the AlertDialog is
from the appcompat class, but the theme derived from the parent
view on ATV devices isn't from AppCompat.
ROM_BASE is 0, and address is unsigned. It is always
true that address >= 0. So just compare with ROM_SIZE
and don't use IN_RANGE macro to avoid the warning.
For some reason every button group was shoved into the same config
group called "Keys" which caused buggy behavior once someone tried to
have hotkeys with the same name in different groups. This fixes that,
and converts existing configs to the new group names so they don't get
reset.
When booting a Wii game, Dolphin can overwrite certain settings
in the SYSCONF file, such as turning off PAL60 for NTSC games.
Normally, these settings get reverted at the end of emulation, but
this does not happen if Dolphin crashes or force quits in some other
way. (Personally, I have a tendency to use Visual Studio's Stop
Debugging button, which kills the process...)
Dolphin also overwrites certain values in setting.txt when booting
a Wii game. Unlike with SYSCONF, we currently make no effort to
preserve the original values in this file.
This change fixes both of these problems by copying SYSCONF and
setting.txt to the Backup folder when booting a Wii game, and then
copying them back either when launching Dolphin (in case the
previous run of Dolphin crashed) or when ending emulation.
This commit attempts to improve error handling for device opening by
reducing panic alert spam when opening one or several devices fails.
Currently, Dolphin shows a panic alert for every device that we fail
to open, and another panic alert at the end if no usable device was
found. That is certainly a bit excessive -- we should only keep the
very last panic alert (the one that is shown if everything fails)
and we can just put the error for the last device open operation there.
This also changes the PanicAlert to a CriticalAlert to ensure the
message is visible even if the user has disabled regular panic alerts.
The message has also been reworded and should hopefully be clearer.
This is related to https://bugs.dolphin-emu.org/issues/10958 which
uses Qt to clear out the window so the game list isn't displayed
while the core is booting. Instead, we use the video backend to
render a black screen, which means Qt doesn't have to flip between
paint engines.
Without included header build fails on gcc-10 as:
```
[ 52%] Building CXX object Source/Core/Core/CMakeFiles/core.dir/DSP/DSPTables.cpp.o
../../../../Source/Core/Core/DSP/DSPTables.cpp: In function 'const char* DSP::pdname(u16)':
../../../../Source/Core/Core/DSP/DSPTables.cpp:492:3: error: 'sprintf' was not declared in this scope
492 | sprintf(tmpstr, "0x%04x", val);
| ^~~~~~~
```
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Without included header build fails on gcc-10 as:
```
[ 13%] Building CXX object Source/Core/AudioCommon/CMakeFiles/audiocommon.dir/CubebUtils.cpp.o
In file included from ../../../../Source/Core/AudioCommon/CubebUtils.cpp:13:
../../../../Source/Core/Common/StringUtil.h: In function 'bool TryParse(const string&, T*)':
../../../../Source/Core/Common/StringUtil.h:84:20: error: 'numeric_limits' is not a member of 'std'
84 | if (value < std::numeric_limits<LimitsType>::min() ||
| ^~~~~~~~~~~~~~
```
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
On Windows, Qt's default system font (MS Shell Dlg 2) is outdated.
Dolphin previously used over 15 lines of code to compute a font
closer to the proper font, but with an approximately correct font size.
Using the QMenu font directly is both more concise and more elegant
(in my opinion).
So far in all our uses of ScopeGuard, the type of the callable is
usually just a lambda or a function pointer, so there is no need
to rely on std::function's type erasure.
While the cost of using std::function is probably negligible, it still
causes some unnecessary overhead that can be avoided by making
ScopeGuard a templated class. Thanks to class template argument
deduction in C++17 most existing usages do not even need to be changed.
See https://godbolt.org/z/KcoPni for a comparison between
a ScopeGuard that uses std::function and one that doesn't
Some locales (e.g. fr_FR.UTF-8 on ArchLinux) don't split the string stream on a space. As such, when extracted formatted data from te stream, it will return the two numbers as one for the first call, effectively overflowing the u32 variable, then will do an out-of-bounds read for the second call. Forcing the use of the C locale on the streams where it would cause a problem allows to workaround this behavior.
It seems that the newer version of fmt gets tripped up by bitfields
within structs. However, we can just specify the intended type where
necessary to get around this.
This changes channel syncing to happen when the operating system is
Android TV rather than when TvMainActivity is launched. (You can run
TvMainActivity on a phone by specifying a launch activity manually
in Android Studio, which I do sometimes for testing purposes. Without
this change, you get an exception when channel syncing runs.)
This fixes CreateFullPath to not create directories when it is known
that they already exist, instead of calling CreateDirectory anyway
and checking if the error is AlreadyExists. (That doesn't work
now that we have an accurate implementation of CreateDirectory
that performs permission checks before checking for existence.)
I'm not sure what I was thinking when I wrote that function.
Also adds some tests for CreateFullPath.
Fixes a regression from #8539.
CreateDirectory was the correct function to use for creating
directories since parent directories already exist and are
not owned by the system menu.
This is only used on Apple and Unix-like machines, so we can enclose the
prototype with an ifdef like the implementation is. This prevents
false-positives about an unimplemented function prototype.
Files cannot be given a different file name, only moved across
directories.
Add a test for that behaviour and fix the existing
RenameWithExistingTargetFile test.
Now that all FS functions that create new inodes are properly
implemented, we can make GetMetadata actually return correct file
metadata rather than giving fixed information. The hack for the DQX
installer can also be removed now since our ES and FS keep track of
caller UID/GIDs now.
With the CreateFile/CreateDirectory fix and this commit, we can
finally return correct results in ReadDirectory and the sorting
hack -- whose purpose was to prevent certain versions of the
System Menu from crashing -- can be removed too.
CreateDirectory does not create missing parent directories. If that
behaviour is desired, CreateFullPath should be used instead.
(These small misuses went unnoticed since the previous implementation
of CreateDirectory automatically created parent directories.)
Previously, the FS root directory would get created as a side
effect of calling CreateDirectory during boot (since the
implementation was sloppy and used File::CreateFullDir).
Since CreateDirectory no longer does that, it is necessary to ensure
that the FS root directory does exist by creating it explicitly.
Some official titles rely on implementation details of Nintendo's
FS sysmodule and will not work properly if those are changed.
Notably, some games and older versions of the System Menu appear
to be relying on the order of files returned by FS::ReadDirectory
and will either fail to find their save data (for Bolt) or
outright crash (for the System Menu).
Some titles also actually expect filesystem metadata to be correct.
One title that has been confirmed to do this is DQX, which generates
paths based on the GID of files within its own title directory.
While it is easy to make workarounds for these issues -- and in fact
we already do have some for the sysmenu and DQX, having hacks
is obviously nonideal and adding yet another hack would be required
to fix Bolt -- one that would be even uglier.
Furthermore, while it is currently unknown whether any official
title cares about permissions, the lack of FS metadata means that
we are unable to implement them if that turns out to be desirable
or necessary.
By adding a FST, we can implement things correctly and solve all
those problems without hacks.
Apart from DQX, the sysmenu and Bolt, this changeset also fixes
the Photo Channel complaining about corrupted system files
on the initial launch.
This first commit adds the basic structures and functions that
are necessary to load, save, query and update our version of the FST.
For simplicity, a binary format that is inspired from Nintendo's FST
structure was chosen for serialization. It is not expected to ever
receive an update.
PS: an update on the NAND image backend:
A long time ago I had planned to add another FS backend which would
be using a NAND image/blob as the storage. While I have already
written an implementation that has been tested, solves all the
aforementioned issues and more, produces images that are fully
compatible with IOS's FS driver, I feel like NAND images raise too
many issues: savestate sizes, code complexity and maintenance cost.
Since many fixes and additions that are part of that implementation
(e.g. FS timings, utility structures, FST) have already been merged
or will be submitted as part of this changeset, I will likely not
submit the branch.
Migrates the shader generator off the use of a global array, eliminating
the use of some global state. This also allows us to move the shader
generation over to using fmt in a subsequent change.
Add a function that safely returns whether a character is printable
i.e. whether 0x20 <= c <= 0x7e is true.
This is done in several places in our codebase and it's easy to run
into undefined behaviour if the C version defined in <cctype>
is used instead of this one, since its behaviour is undefined
if the character is not representable as an unsigned char.
This fixes MemoryViewWidget.
Otherwise, a line that's too wide for the log widget will cause the horizontal scroll bar to appear, which reduces the vertical height, and causes the most recent line to be off screen. Since that line is off screen, the log widget no longer scrolls as new lines appear, unless it's manually scrolled to the very bottom again.
This is an alternative to PR 8557 and PR 8558. The way this PR solves
the problem is essentially the same as what we had before PR 8394
(except the code we had back then only worked because it was broken).
Currently, we do not display every second frame in 25fps/30fps games
which run to vsync. This improves performance as there's less rendering
for the GPU to perform, but when combined with vsync, could cause frame
pacing issues.
This commit adds an option to force every frame generated by the console
to be displayed to the host, which may improve pacing for these games.
This works around Linux drivers for DS4 (Playstation 4) controllers splitting the device into three separate event nodes which makes configuration difficult.
To prevent collisions of input names in combined devices more descriptive names are now used when possible.
Due to the way the ModRM encoding works on x86, memory addressing
combinations involving RBP or R13 need an additional byte for an 8-bit
displacement of zero.
However, this was also applied in cases where it is unnecessary,
effectively wasting a byte.
- MatR with RSP or R12
8B 44 24 00 mov eax,dword ptr [rsp]
8B 04 24 mov eax,dword ptr [rsp]
- MRegSum with base != RBP or R13
46 8D 7C 37 00 lea r15d,[rdi+r14]
46 8D 3C 37 lea r15d,[rdi+r14]
- MComplex without offset
8B 4C CA 00 mov ecx,dword ptr [rdx+rcx*8]
8B 0C CA mov ecx,dword ptr [rdx+rcx*8]
Test the behavior of OpArg::WriteRest by using MOV with the various
addressing modes (MatR, MRegSum, etc.) in the source operand.
Both the instruction and the instruction length are validated.
This updates the lint script to require clang-format 9 and reformats
existing source code. Since VS2019 ships with clang-format 9 this
should make auto reformats less painful.
This also updates the clang-format configuration to set
BraceWrapping.AfterCaseLabel to true to ensure consistent brace
style; otherwise clang-format 9+ defaults to putting braces on
the same line as switch case labels.
Was checking over this old code, and saw a comment calling me out for a lack of documentation.
It might be half a decade late, but better late then never.
The old logic would always emit LEA when both sources are in a register
and OE is disabled. However, ADD is still preferable when one of the
sources matches the destination.
Before:
45 8D 6C 35 00 lea r13d,[r13+rsi]
After:
44 03 EE add r13d,esi
The ES sysmodule in IOS62 (v6430) has an exception for the
Wii U Transfer Tool in the SetUid function.
If the active title is the Wii U Transfer Tool, then calling SetUid
is always allowed. (The UID is still checked first, though.)
Fixes https://bugs.dolphin-emu.org/issues/10985
Partitions are Wii-exclusive, and don't happen at the DVDInterface level in
IOS. This isn't quite the cleanest fix, but it gets rid of the assumption that
a partition is open on starting the game at least.
The various ioctls sometimes have different arguments than the DI command
registers, though they generally overlap. There are also a bunch of ioctls
that don't even normally go into DVDInterface, just returning various data.
Some of the implemented ioctls are new to Dolphin.
A small, nonexhaustive set of warning fixes. The DiscIO Volume change
is a workaround for a GCC bug [1] that causes returning an unengaged
std::optional to emit annoying -Wmaybe-uninitialized warnings.
This last change alone fixes pages upon pages of warnings since
Volume.h is included from several files.
-Wstringop-truncation is another irrelevant warning for us, but
unfortunately there seems to be no way to disable it without
adding ugly pragmas wherever the warning appears.
- Refactor the Config::System::Main check so we check system once,
then we check for the section.
- Use an std::array<> instead of std::vector<>.
- Use an array of pointers instead of an array of ConfigLocation.
The latter contains two std::string objects, whereas pointers
are only 8 bytes (on 64-bit).
Code size comparison: (64-bit Linux, gcc-9.2.0, release build)
text data bss dec hex filename
16136 0 40 16176 3f30 IsSettingSaveable.cpp.o [before]
3933 720 0 4653 122d IsSettingSaveable.cpp.o [after]
-12203 +720 -40 -11523 -2d03 Difference
NOTE: The explicit std::string() conversions later are needed. Otherwise,
gcc-9.2.0 throws all sorts of errors because it can't find a matching
operator+() function.
"ppcState{}" is stored in the .data segment, which means the full ~4 MB
is stored in the executable.
"ppcState" is stored in the .bss segment, which means it only stores a
note that tells it to allocate and zero ~4 MB at runtime.
string_view is a thin wrapper around C strings, so it's more efficient
for constant strings than C++ strings.
The unordered_set<> also adds extra runtime overhead. For small arrays,
a simple linear search works. For larger arrays, std::binary_search()
works better than linear but without the unordered_set<> overhead.
ShouldBeDualLayer(): Removed a duplicate "SK8X52" entry.
This was a huge speedup with disabled fastmem, but it still requires the fastmem arena.
So let's disable it for now, even if this commit has a huge performance hit with disabled fastmem.