Turns out, Gamecube games actually do check DILENGTH, and if DILENGTH is at 0, they'll think the transfer completed successfully even if DEINT is used, since after all, surely that means everything was sent. That caused all sorts of issues, from audio looping when a disc is removed since it's re-using the same buffer to just flat-out crashing instead of showing the disc removed screen.
In particular:
- Trying to play audio in a non-ready state returns the state-specific error, not an audio buf error
- Audio status cannot be requested in non-ready states
- The audio buffer cannot be configured in states other than ReadyNoReadsMade
- Using the stop motor command while the motor is already stopped doesn't change states
Additionally, the internal state IDs are used (which distinguish ReadyNoReadsMade and Ready), instead of the state IDs exposed in request error. This makes some of the weird behavior a bit more obvious.
State and error behavior of the seek command was not implemented in this commit.
It is my opinion that nobody should use NKit disc images without
being aware of the drawbacks of them. Since it seems like almost
nobody who is using NKit disc images knows what NKit is (hmm, now
how could that have happened...?), I am adding a warning to Dolphin
so that you can't run NKit disc images without finding out about the
drawbacks. In case someone really does want to use NKit disc images,
the warning has a "Don't show this again" option. Unfortunately, I
can't retroactively add the warning where it's most needed:
in Dolphin 5.0, which does not support Wii NKit disc images.
Pretty much the same optimization we did for AVX, although slightly more
constrained because we're stuck with the two-operand instruction where
destination and source have to match.
We could also specialize the case where registers b, c, and d are all
distinct, but I decided against it since I couldn't find any game that
does this.
Before:
66 0F 57 C0 xorpd xmm0,xmm0
66 41 0F C2 C1 06 cmpnlepd xmm0,xmm9
41 0F 28 CE movaps xmm1,xmm14
66 41 0F 38 15 CC blendvpd xmm1,xmm12,xmm0
44 0F 28 F1 movaps xmm14,xmm1
After:
66 0F 57 C0 xorpd xmm0,xmm0
66 41 0F C2 C1 06 cmpnlepd xmm0,xmm9
66 45 0F 38 15 F4 blendvpd xmm14,xmm12,xmm0
AVX has a four-operand VBLENDVPD instruction, which allows for the first
input and the destination to be different. By taking advantage of this,
we no longer need to copy one of the inputs around and we can just
reference it directly, provided it's already in a register (I have yet
to see this not be the case).
Before:
66 0F 57 C0 xorpd xmm0,xmm0
F2 41 0F C2 C6 06 cmpnlesd xmm0,xmm14
41 0F 28 CE movaps xmm1,xmm14
66 41 0F 38 15 CA blendvpd xmm1,xmm10,xmm0
F2 44 0F 10 F1 movsd xmm14,xmm1
After:
66 0F 57 C0 xorpd xmm0,xmm0
F2 41 0F C2 C6 06 cmpnlesd xmm0,xmm14
C4 C3 09 4B CA 00 vblendvpd xmm1,xmm14,xmm10,xmm0
F2 44 0F 10 F1 movsd xmm14,xmm1
Fixes a critical regression where 95945a0 made us unable to
start emulation on Android 10 and newer. Android is restricting
direct access to /dev/ashmem starting with the new SDK version,
but we can use the new (and simpler) ASharedMemory API instead.
We have to keep using the /dev/ashmem approach on old versions
of Android, though.
On Linux, if shared zlib is present, zlib.h is always available and -lz
links to zlib, even if you don't run find_package(ZLIB).
For some reason I have zlib installed on Windows (possibly from vcpkg),
so find_package(ZLIB) succeeds and ZLIB_FOUND is true.
When Dolphin uses shared zlib on Windows, the problem is that zlib.h
is not in the default include path, and the CMake target is called
ZLIB::ZLIB and there's neither a target nor a library called z.
However, both find_package(ZLIB) and add_subdirectory(Externals/zlib)
create a target called ZLIB::ZLIB, so I'll switch to that instead.
Hopefully this change doesn't break anyone's build.
This modifies GCMemcard::TitlePresent() to match my findings of how the GC BIOS and various games behave when you alter the fields in the directory entry.
It looks like for a save to be recognized by a game, the following have to be true:
- Game code and maker code must exactly match what the game expects.
- Filename is only checked up to the first null byte. All bytes afterwards can be whatever.
The BIOS itself does a full compare of the filename when checking for whether it should allow copying a file from one card to another, but behaves oddly in some cases when there's non-null bytes after the first null. See the big comment in `HasSameIdentity()` for details.
This could cause read errors if chunks were laid out a certain
way in the file and the whole chunk wasn't being read at once.
Should fix https://bugs.dolphin-emu.org/issues/12184.
I believe the value returned by value() resets when we call
setValue() with the maximum (due to auto-reset). I have been
unable to test this because I can't reproduce the issue, which is
described at https://bugs.dolphin-emu.org/issues/12158#note-9.
The functions with "UTF" in the name use "modified UTF-8" rather
than the standard UTF-8 which Dolphin uses, at least according
to Oracle's documentation, so it is incorrect for us to use them.
This change fixes the problem by converting between UTF-8 and
UTF-16 manually instead of letting JNI do it for us.
This function does *not* always convert from UTF-16. It converts
from UTF-16 on Windows and UTF-32 on other operating systems.
Also renaming UTF8ToUTF16 for consistency, even though it
technically doesn't have the same problem since it only was
implemented on Windows.
As a side effect of 9c5c3c0, Dolphin's frame counter was changed
to run at 60/50 Hz even if the game is running at a lower framerate
such as 30 fps. Since the TAS input turbo button functionality
toggled the state of a button every other frame as reported by
the frame counter, this change made the turbo button functionality
not work with 30/25 fps games.
I believe it would be hard to change the frame counter back to
how it used to work without undermining the point of 9c5c3c0,
and I'm not sure if doing so would be desireable or not anyway,
so what I'm doing instead is letting the user determine how long
turbo button presses should last. This lets users avoid the 30/25
fps game problem while also granting additional flexibility.
Perhaps there is some game where it is useful to mash at a speed
which is slower than frame perfect.
Also added a IsRunning function as it was impossible to know whether it had been started or not (I will use it in later PRs but it should be there anyway)
Currently, the touch controller overlay uses a square gate for
sticks. This commit changes that so that it instead uses the
stick gate configured in the INI, which ensures that the values
sent to the core are appropriately scaled regardless of what
is configured in the INI and makes the overlay look nicer
if the INI is set to a stick gate that matches the graphics.
While manually capturing constexpr variables used in lambda
expressions does work, it's really easy to forget doing so since
we don't have a Windows CMake builder and the workaround isn't
necessary anywhere else. Fortunately, MSVC has a flag that fixes
the constexpr capture behavior, so let's use that instead.
Fixes https://bugs.dolphin-emu.org/issues/12066.
I must've only tested the frame counter with an earlier version
of the PR that broke this, not the final version...
HostIsInstructionRAMAddress uses XCheckTLBFlag::OpcodeNoException,
so we should also use XCheckTLBFlag::OpcodeNoException when reading,
to ensure that we use the IBAT (as opposed to the DBAT) for both.
Doesn't support triggering interrupts when the thermal threshold is
exceeded, but allows polling for temperature information.
The THRM[123] registers are documented in most PPC datasheets, see e.g.
this PPC750CX one: http://datasheets.chipdb.org/IBM/PowerPC/750/750cx_um3-17-05.pdf
PURGE isn't especially useful, while requiring some annoying
special handling in the file format. If you want no compression,
use NONE. If you want fast compression, use Zstandard.
Gets rid of the need to seek to the end of the file
when opening a file.
The downside of this is that we waste a little space,
since we can't know in advance exactly how much
space the compressed parts of the headers will need.
This is useful for the way Dolphin scrubs Wii discs.
The encrypted data is what gets zeroed out, but this
zeroed out data then gets decrypted before being stored,
and the resulting data does not compress well.
However, each block of decrypted scrubbed data is
identical given the same encryption key, and there's
nothing stopping us from making multiple group entries
point to the same offset in the file, so we only have
to store one copy of this data per partition.
For reference, wit zeroes out the decrypted data,
but Dolphin's WIA writer can't do this because it currently
doesn't know which parts of the disc are scrubbed.
This is also useful for things such as storing Datel discs
full of 0x55 blocks (repesenting unreadable blocks)
without compression enabled.
Fixes https://bugs.dolphin-emu.org/issues/10654.
To quote the documenation file included with the program tgctogcm:
"TGC's are miniaturized .gcm images with a 32kB header.
The embedded gcm contains some bogus data, namely:
-FST Location (0x424 in gcm)
-DOL Location (0x420 in gcm)
-FST File offsets (all files are offset/spoofed by a certain amount)"
Dolphin has been handling the values at 0x420 and 0x424 by simply
overwriting them with a working value (just like tgctogcm does),
but it has used a different approach for the file offsets in the FST.
Instead of changing the offsets that are stored in the FST, Dolphin
changed where the files actually are placed on the virtual disc.
My hope was that this would make the loading times more accurate to
how they are when running a TGC file as part of a larger disc.
However, there are TGC files where we would need to move files
backwards on the disc in order to do this (this is what issue
10654 is about), so the approach we have been using is flawed.
This change makes Dolphin overwrite offsets in the FST instead, like
tgctogcm does. Other than making Dolphin handle the affected TGC files
correctly, this change also makes it so that unnecessary padding data
isn't written if you use Dolphin to convert a TGC file to an ISO file.
This feature is not actually implemented in Dolphin as of now, but I'm
planning to add it in the near future as part of a larger feature.
This is intended to catch WIA files which have been created using
wit's default parameters (40 MiB block size), once the WIA PR is
merged. The check does however also work for GCZ files – not that
I think anyone has a GCZ file with a block size that large.
There was a race condition between two PRs incrementing the
array size. CI didn't catch it because the PR that was merged
last (PR #8824) wasn't rebuilt after the first PR was merged.
canonicalPath is orders of magnitude slower as it has to perform actual
disk I/O to resolve symlinks, which makes sorting by this column
ridiculously slow for large game lists, especially if the games are on
a NAS. We probably don't need that, simply resolving relative paths
should be sufficient.
std::result_of is deprecated in C++17, and removed in C++20. Microsoft
has gone ahead with the removal as of Visual Studio 16.6.0, so before
this change our code is broken there.
These are only ever used with ShaderCode instances and nothing else.
Given that, we can convert these helper functions to expect that type of
object as an argument and remove the need for templates, improving
compiler throughput a marginal amount, as the template instantiation
process doesn't need to be performed.
We can also move the definitions of these functions into the cpp file,
which allows us to remove a few inclusions from the ShaderGenCommon
header. This uncovered a few instances of indirect inclusions being
relied upon in other source files.
One other benefit is this allows changes to be made to the definitions
of the functions without needing to recompile all translation units that
make use of these functions, making change testing a little quicker.
Moving the definitions into the cpp file also allows us to completely
hide DefineOutputMember() from external view, given it's only ever used
inside of GenerateVSOutputMembers().
A very trivial conversion, this simply converts calls to Write over to
WriteFmt and adjusts the formatting specifiers as necessary.
This also allows the const char* parameters to become std::string_view
instances, allowing for ease of use with other string types.
The include for X11Utils.h (and by extension Xlib.h) is gated behind
HAVE_XRANDR, as well as the declaration for this function, but its
definition was mistakenly gated behind HAVE_X11. Therefore, if we have
X11 but not Xrandr, the build will fail due to declaration/definition
mismatch and the missing Window type.
CopyNandFile must not create empty files on the destination filesystem
if the source file doesn't exist.
Otherwise, this can lead to an empty Mii database being created in the
session Wii root if there's no database in the configured Wii root and
netplay or Movie is used -- that database would then be copied back to
the configured root, which causes games like MKW to complain about
corrupted Mii data even when the player has stopped using netplay.
This commit also simplifies CreateFullPath usage.
There's no need to manually extract the directory from the path,
FS::CreateFullPath does it automatically just like File::CreateFullPath
Remove the warning:
warning: offsetof within non-standard-layout type ‘JitBlock’ is conditionally-supported
JitBlock contains non-trival types now. Split the fields with trival
types that needs to be access from JIT code into JitBlockData structure.
`std::abs(x - y)` where x and y are unsigned integers fails to compile
with an "call of overloaded 'abs(unsigned int)' is ambiguous" error
on GCC, and even if it did compile, that expression still wouldn't
give the correct result since `x - y` is unsigned.
Lambda expressions with uncaptured constants were leading to errors,
and there were also some warnings about deprecated functions
(QFontMetrics::width and inet_ntoa).
Panic alerts don't use fixed width fonts, and translators are
unlikely to preserve the exact spacing unless they are given
specific instructions to do so and are willing to fight against
the Transifex interface a bit.
We must not provide the /Externals directory as global include directory.
Here, this yield a crash because of external minizip header and system library mismatch.
Soundtouch itself recormends to include it with <SoundTouch.h> and -I/usr/include/soundtouch, so this should fit better.
It actually maps to postMtxInfo, not posMtxInfo (which isn't a thing).
This is especially confusing because there *are* position matrices (as
opposed to post-transform matrices).
Changed several enums from Memmap.h to be static vars and implemented Get functions to query them. This seems to have boosted speed a bit in some titles? The new variables and some previously statically initialized items are now initialized via Memory::Init() and the new AddressSpace::Init(). s_ram_size_real and the new s_exram_size_real in particular are initialized from new OnionConfig values "MAIN_MEM1_SIZE" and "MAIN_MEM2_SIZE", only if "MAIN_RAM_OVERRIDE_ENABLE" is true.
GUI features have been added to Config > Advanced to adjust the new OnionConfig values.
A check has been added to State::doState to ensure savestates with memory configurations different from the current settings aren't loaded. The STATE_VERSION is now 115.
FIFO Files have been updated from version 4 to version 5, now including the MEM1 and MEM2 sizes from the time of DFF creation. FIFO Logs not using the new features (OnionConfig MAIN_RAM_OVERRIDE_ENABLE is false) are still backwards compatible. FIFO Logs that do use the new features have a MIN_LOADER_VERSION of 5. Thanks to the order of function calls, FIFO logs are able to automatically configure the new OnionConfig settings to match what is needed. This is a bit hacky, though, so I also threw in a failsafe for if the conditions that allow this to work ever go away.
I took the liberty of adding a log message to explain why the core fails to initialize if the MIN_LOADER_VERSION is too great.
Some IOS code has had the function "RAMOverrideForIOSMemoryValues" appended to it to recalculate IOS Memory Values from retail IOSes/apploaders to fit the extended memory sizes. Worry not, if MAIN_RAM_OVERRIDE_ENABLE is false, this function does absolutely nothing.
A hotfix in DolphinQt/MenuBar.cpp has been implemented for RAM Override.
Utilizing constexpr, we can eliminate the need to construct the tables
at runtime and just do all the work at compile-time. Making for less
moving parts overall.
The general structure is more or less the same, however rather than one
single initialization function, each table is built off an immediately
executed lambda function. This is nice, since it narrows the scope of
the table building logic down to the tables that actually need it.
While having motion control emulation of IR enabled by default
makes sense in situations like using a DualShock 4 on a PC,
Android has the additional option of touch emulation of IR
which seems to be better liked, and the default value which
was chosen with PC in mind was carried over to Android
without any particular consideration. This change disables
motion control emulation of IR by default on Android only.
The code was actually already rather well adapted for this.
We more or less just have to skip ParseDisc and run
ParsePartitionData directly. This required the PartitionHeader
struct to be removed (which wasn't that useful anyway).
If we start 31 KiB into a 32 KiB block and want to mark 2 KiB
of data as used, we need to mark 2 blocks as used, not just 1.
This problem is avoided when calling MarkAsUsed from
MarkAsUsedE, since MarkAsUsedE aligns to 32 KiB on its own.
Most calls to MarkAsUsed are from MarkAsUsedE, which is why
this hasn't been a noticeable problem in the past.
The constant DESIRED_BUFFER_SIZE was determined by multiplying the
old hardcoded value 32 with the default GCZ block size 16 KiB.
Not sure if it actually is the best value, but it seems fine.
Similar to what we do for addx. Since we're calculating b - a and
because subtraction is not communitative, we can only apply this when
source register a holds the constant.
Before:
45 8B EE mov r13d,r14d
41 83 ED 08 sub r13d,8
After:
45 8D 6E F8 lea r13d,[r14-8]
We can get away with skipping the addition when we know we're dealing
with a constant zero. Just a MOV will suffice in this case.
Once again, we don't bother to add separate handling for when overflow
is needed, because no titles would ever hit that path during my testing.
Before:
8B 7D F8 mov edi,dword ptr [rbp-8]
83 C7 00 add edi,0
After:
8B 7D F8 mov edi,dword ptr [rbp-8]
ADD has a smaller encoding for immediates that can be expressed as an
8-bit signed integer (in other words, between -128 and 127). MOV lacks
this compact representation.
Since addition allows us to swap the source registers, we can always get
the shortest sequence here by carefully checking if we're dealing with a
small immediate first. If we are, move the other source into the
destination and add the small immediate onto that. For large immediates
the reverse is preferrable.
Before:
41 BE 40 00 00 00 mov r14d,40h
44 03 75 A8 add r14d,dword ptr [rbp-58h]
After:
44 8B 75 A8 mov r14d,dword ptr [rbp-58h]
41 83 C6 40 add r14d,40h
Before:
44 8B 7D F8 mov r15d,dword ptr [rbp-8]
41 81 C7 00 68 00 CC add r15d,0CC006800h
After:
41 BF 00 68 00 CC mov r15d,0CC006800h
44 03 7D F8 add r15d,dword ptr [rbp-8]
When the source registers are a simple register and a constant zero and
overflow isn't needed, emitting LEA is kinda silly.
This will occasionally save a single byte for certain registers due to
how x86 encoding works. More importantly, LEA takes up execution
resources while MOV does not.
Before:
41 8D 7D 00 lea edi,[r13]
After:
41 8B FD mov edi,r13d
When the destination register matches a source register, the other
source register contains zero, and overflow isn't needed, the
instruction becomes a nop and we don't need to emit anything.
We could add specialized handling for the case where overflow is needed,
but none of the titles I tried would hit this path.
Before:
83 C7 00 add edi,0
After:
No functional change, just simplify some repeated logic in the case
where we're dealing with exactly one immediate and one simple register
when overflow isn't needed.