This fixes various texture offsetting issues with negative texture coordinates (bringing the software renderer in line with the hardware renderers). It also handles the invalid wrap mode accurately (as was done for the hardware renderers in the previous commit). Lastly, it handles wrapping with non-power-of-2 texture sizes in a hardware-accurate way (which is somewhat broken looking, as games aren't supposed to use wrapping with non-power-of-2 sizes); this has not been done for the hardware renderers.
A voice is considered running if and only if `running` equals 1,
not if `running` is not equal to 0.
This fixes https://bugs.dolphin-emu.org/issues/12508 because for some
reason *The Sims 2 - Castaway* sets `running` to 8 when a stream
finishes playing; previously our AX HLE would just loop the voice
and eventually crash after accessing invalid memory addresses.
Thanks to JMC47 and delroth's help, I've verified that this is the
correct check for the following ucodes:
GC:
* 0x3ad3b7ac
* 0x3daf59b9
* 0x4e8a8b21
* 0x07f88145
* 0xe2136399
* 0x3389a79e
Wii:
* 0x347112ba
* 0xfa450138
* 0xadbc06bd
And while I was fixing the running check, I noticed that the is_stream
field was also being handled incorrectly, so I've fixed that as well.
Putting AX functions from AXVoice.h in an anonymous namespace does
successfully prevent compilers from merging those functions and
allows us to avoid ODR violations.
However, tools such as gdb still mix up AX GC and AX Wii functions
and variables because those have the exact same symbol names.
This can be fixed by using inline namespaces which are transparent
at the source code level but forces AX GC and AX Wii symbols to be
different.
The fast path of using CVTSD2SS/FCVTN rounds the significand if it
can't be exactly represented as a single, whereas the accurate path
instead truncates the significand. So we should only use the fast
path if we know that the lower bits of the significand are not set.
This is not known to affect any games.
Passing a width of 64 and registers encoded as double to
DUP resulted in an invalid instruction. The registers should
be encoded as quads in this situation.
Fixes https://bugs.dolphin-emu.org/issues/12575.
Manually encoding and decoding logical immediates is error-prone.
Using ORRI2R and friends lets us avoid doing the work manually,
but in exchange, there is a runtime performance penalty. It's
probably rather small, but still, it would be nice if we could
let the compiler do the work at compile-time. And that's exactly
what this commit does, so now I have no excuse for trying to
manually write logical immediates anymore.
If a branch is unconditional, its target should not be in farcode,
since that defeats the purpose of farcode (putting seldom executed
code in farcode to keep it out of the icache when possible).
Fixes a 58698b8380 regression. (The EXCEPTION_EXTERNAL_INT
immediate being wrong meant that we never took the branch,
masking the problem of the MSR.EE immediate being wrong...)
In cases where we already know that there is an exception,
either because we just checked for it or because we were
the ones that generated the exception to begin with,
we can skip the branch inside WriteExceptionExit.
Unlike most constants we emit in JitArm64, these constants are
*not* inherent to the CPU we're emulating, and can have whatever
values we want. Let's handle them more robustly, in case we
decide to change their values in the future.
Public domain does not have an internationally agreed upon definition,
As such it's generally preferred to use an extremely liberal license,
which can explicitly list the rights granted by the copyright holder.
The CC0 license is the usual choice here.
This "relicensing" is done without hunting down copyright holders, since
it is presumed that their release of this work into the public domain
authorizes us to redistribute this code under any other license of our
choosing.
This code was part of Dolphin's relicensing from v2 to v2+ a while back,
we just never updated these copyright headers. I double-checked that
segher gave us permission to relicense this code to v2+ on 2015-05-16.
SPDX standardizes how source code conveys its copyright and licensing
information. See https://spdx.github.io/spdx-spec/1-rationale/ . SPDX
tags are adopted in many large projects, including things like the Linux
kernel.
This broke ejecting Wii discs while the game is running, as the drive state was set to Ready even when no disc was present, but other code still reported the missing disc, which confused games as you can't be both ready to read and have no disc. That would cause games to show an unrecoverable error screen, instead of a "please insert the game disc" screen.
This only affected Wii games; the GameCube games used regular disc reads which worked fine.
Performance optimization, along with making the code a little
neater. Saves us from performing a single -> double -> single
conversion when calling UpdateFPRFSingle.
If we already have to use a GPR, we might as well take advantage
of the nice immediate encodings provided by GPR ORR. This is
faster, smaller, and saves a register.
Some of the code used when the carry flag is known to be a
constant value is really not much better than just setting
the carry flag and then using the normal code, and with how
rarely this code runs, it isn't well tested either.
Might as well get rid of some of this code and simplify things.
These optimizations were already present, but only when d == a. They
also make sense when this condition does not hold.
- imm == 0
Before:
41 BB 00 00 00 00 mov r11d,0
45 2B DF sub r11d,r15d
After:
45 8B DF mov r11d,r15d
41 F7 DB neg r11d
- imm == -1
Before:
41 BD FF FF FF FF mov r13d,0FFFFFFFFh
44 2B EE sub r13d,esi
0F 93 45 68 setae byte ptr [rbp+68h]
After:
44 8B EE mov r13d,esi
41 F7 D5 not r13d
C6 45 68 01 mov byte ptr [rbp+68h],1
Without this, the code added in ac28b89 misbehaves and considers
AArch64 netplay clients to not have hardware FMA support, telling
all clients to disable FMA support, which causes a desync between
x64 and AArch64 due to JitArm64 not being able to disable FMA support.
Fixes a regression from 5.0-12066, where setting the GFXBackend variable
to one other than the current global backend would crash Dolphin upon
launching the game.
fcmpX only updates the FPCC bits, not the C bit.
This was already correctly implemented in the interpreter.
Not known to affect any games, but affects a hardware test.
This fixes bounding box shaders failing to compile under Vulkan, due to
differences between GLSL and HLSL in the return value of vector
comparisons and what types these functions accept. I included all() for
the sake of completeness.
At higher resolutions, our bounding box dimensions end up being
slightly larger than original hardware in some cases. This is not
necessarily wrong, it's just an artifact of rendering at a higher
resolution, due to bringing out detail that wouldn't have appeared on
original hardware. It causes a texel to fall partially on what would
have been a single pixel at native resolution, resulting in the
coordinates getting bumped up to the next valid value. In many cases,
these slightly larger bounding boxes are perfectly fine, as games don't
hard-code expected dimensions. It is problematic in Paper Mario TTYD
though, for a somewhat complicated reason.
Paper Mario TTYD frequently uses EFB copies to pre-render a bunch of
animation frames for a character sprite (especially in Chapter 2), so
that it can then render 100 or more of them without bringing the
GameCube to its knees. Based on my observation, the game seems to set
aside a region of memory to store these EFB copies. This region is
obviously fairly small, as the GameCube only has 24MB of RAM. There are
2 rooms in Chapter 2 where you fight a horde of as many as 100 Jabbies,
which are also rendered using EFB copies, so in this room the game ends
up making 130(!) EFB copies just for Puni and Jabbi sprites. This seems
to nearly fill the region of memory it set aside for them.
Unfortunately, our slightly larger bounding boxes at higher resolutions
results in overflowing this memory, causing very strange behavior. Some
EFB copies partially overlap game state, resulting in reading it as a
garbage RGB5A3 texture that constantly changes. Others apparently
somehow trigger a corner case in our persistent buffer mapping, causing
them to partially overwrite earlier EFB copies.
What this change does is only include the screen coordinates that align
with the equivalent native resolution pixel centers, which generally
results in the bounding boxes being more in line with original
hardware. It isn't perfect, but it's enough to fix Paper Mario TTYD's
Jabbi rooms by avoiding the buffer overflow. Notably, it is more
accurate at odd resolutions than at even resolutions. Native resolution
is completely unaffected by this change, as should be the case. This
change may also have a small positive impact on shader performance at
higher resolutions, as there will be less atomic operations performed.
Not doing this can cause desyncs when TASing. (I don't know
how common such desyncs would be, though. For games that
don't change rounding modes, they shouldn't be a problem.)